Article

Assessment of the Usefulness of Spectral Bands for the Next Generation of Sentinel-2 Satellites by Reconstruction of Missing Bands

CESBIO, Université de Toulouse, CNES/CNRS/IRD/UPS, 31401 Toulouse, France
* Author to whom correspondence should be addressed.
Remote Sens. 2022, 14(10), 2503; https://doi.org/10.3390/rs14102503
Submission received: 26 April 2022 / Revised: 14 May 2022 / Accepted: 19 May 2022 / Published: 23 May 2022
(This article belongs to the Section Satellite Missions for Earth and Planetary Exploration)

Abstract

The Sentinel-2 constellation has been providing high spatial, spectral and temporal resolution optical imagery of the continental surfaces since 2015. The spatial and temporal resolution improvements that Sentinel-2 brings with respect to previous systems have been demonstrated in both the literature and operational applications. On the other hand, the spectral capabilities of Sentinel-2 appear to have been exploited to a limited extent only. At the moment of definition of the new generation of Sentinel-2 satellites, an assessment of the usefulness of the current available spectral bands seems appropriate. In this work, we investigate the unique information contained by each 20 m resolution Sentinel-2 band. A statistical quantitative approach is adopted in order to yield conclusions that are application agnostic: multivariate regression is used to reconstruct some bands, using the others as predictors. We conclude that, for most observed surfaces, it is possible to reconstruct the reflectances of most red edge or NIR bands from the rest of the observed bands with an accuracy within the radiometric requirements of Sentinel-2. Removing two of those bands could be possible at the cost of slightly higher reconstruction errors. We also identify mission scenarios for which several of the current Sentinel-2 bands could be removed for the next generation of sensors.

1. Introduction

The Sentinel-2 constellation constitutes a revolution in remote sensing in terms of data quantity, quality and availability. The high spatial and temporal resolutions of Sentinel-2 [1] have been demonstrated to be crucial for many applications that have been reported in the scientific literature and validated by operational applications covering a wide range of use cases, such as land-cover mapping, snow-extent mapping, biophysical parameter estimation, agriculture monitoring, etc.
Sentinel-2 provides 13 spectral bands with spatial resolutions from 10 m to 60 m and a 5-day revisit cycle.
The particularities of Sentinel-2 with respect to pre-existing comparable systems are:
  • in the temporal domain, a systematic acquisition plan (unlike tasked satellites, which acquire scenes on demand) with a high revisit frequency (5 days compared to the 16 days of Landsat);
  • in the spatial domain, a higher resolution than Landsat (10 m to 20 m compared to 30 m);
  • in the spectral domain, an increased number of bands with respect to both the classical blue, green, red, NIR band set and Landsat (4 visible, 1 NIR, 2 SWIR), with the novelty of 3 red edge (RE) bands, although without the thermal band available on Landsat.
However, as we show in Section 1.4, very few published works have made full use of the spectral richness of Sentinel-2, and often these uses have not been demonstrated to be the only way to extract the target information.
After 5 years of operation, work on the new generation of Sentinel-2 satellites (S2NG) has started, and one of the tasks is to define its set of spectral bands. The question "Which additional spectral bands could be put on board S2NG?" has to be balanced against the question of which current S2 bands may be dispensable, that is, which of the currently available bands could be removed for S2NG. Adding spectral bands to a satellite bears a cost, which could affect the trade-off with other mission requirements, such as a temporal revisit that needs an additional satellite.
Of course, all current Sentinel-2 bands contain potentially useful information, since they sample different intervals of the electromagnetic spectrum, and, except for the pair B8-B8A (see Section 1.1), there is no significant overlap between the different spectral ranges. However, since there exists a high level of redundancy in the underlying observed nature, one can expect high degrees of correlation between the different bands, allowing us to question the true usefulness of some bands.
With 5 years of data collection and exploitation, it is now possible to quantitatively assess the usefulness of the different bands on board Sentinel-2. This could be done in terms of the quality of the results of downstream processing (biophysical parameter estimation, land-cover mapping, etc.), but this would need to address a huge number of application domains with experiments and validation data without the guarantee of exhaustivity or chances of replicability.
On the other hand, if we address the problem from the information content point of view, we only have to deal with data at the sensor level. We therefore choose to pose the problem as a data reconstruction one: if one band can be reconstructed—within a predefined error margin—from the other bands, it can be removed from the satellite without quantitative loss of information.
One could argue that what matters is the estimation of physical parameters and that imperfect reconstruction of a particular band may have no impact for many applications. This would allow going further in terms of spectral band removal. We agree with this point of view, but all downstream processing entails the use of (imperfect by construction) models, and the closer we get to the sensor, the more application-independent the conclusions of the study will be.
The aim of this paper is to leverage this interband correlation and assess which bands could be removed from future iterations of the Sentinel-2 constellation with a minimal impact on the usefulness of the acquired data. To do so, we predict the reflectances of missing bands with nonlinear regression algorithms that use the other spectral bands as predictors. In order to produce results that are representative of real settings and are generalizable, we build a dataset by sampling pixels from Sentinel-2 acquisitions with a wide variety of geographic areas and dates. We therefore take an empirical, data-driven approach.
We choose not to leverage the spatial and the temporal dimensions and carry out a mono-date, pixel-based analysis. We understand that temporal and spatial correlations would reduce the errors in the reconstruction of missing bands. The goal of the work is not to propose an optimal regression algorithm, but rather to show that band reconstruction is possible using regression. The results of this work can be seen as a lower bound in terms of reconstruction quality and therefore encourage the pursuit of further studies.

1.1. The Sentinel-2 Spectral Bands

Table 1 gives the name and the central wavelength for each band acquired by Sentinel-2. There are four bands at 10 m resolution: the three usual visible bands (B2–B4) and a wide NIR band (B8). The 20 m resolution bands are three narrow bands in the red edge (B5-B7), one narrow NIR (B8A) and two SWIR bands (B11, B12). Finally, the 60 m resolution bands are aimed at radiometric corrections (B1 for aerosol content estimation, B9 for water vapor and B10 for cirrus detection). Figure 1 illustrates the relative spectral responses of the 10 m, 20 m and 60 m resolution bands.

1.2. S2 Radiometric Requirements

The Sentinel-2 Mission Requirements Document (MRD) [2] states that for the applications covered by this mission, the radiometric accuracy at top of atmosphere (TOA) has to be not worse than 3% (goal) to 5% (threshold). For inter-band radiometric calibration, 3% accuracy is also required.
These requirements allow the definition of error bounds for the band reconstruction tasks that we assess in this work. For TOA reflectances, we can aim for the 3% reconstruction error. In terms of surface reflectance, the accuracy of the MACCS-ATCOR Joint Algorithm (MAJA) processor [3,4] is 0.01 (not in %, but in reflectance values), and we can use this value as the requirement.
Given that the measurements are affected by other errors (geometric registration between bands, Modulation Transfer Function (MTF) differences, etc.), achieving these error bounds can be considered rather ambitious.
Other approaches to define the reconstruction requirements could be used. For instance, [5] presents a radiometric uncertainty tool which allows estimation of the radiometric uncertainty associated with each pixel of a Sentinel-2 image in the TOA images provided by ESA. The approach integrates all the errors from the TOA reflectance to the L1C product, and typical values are greater than 10% for open sea, 5% to 15% in rice fields covered by water and 2% to 4% for land areas. We see that the 3% specification is very strict.

1.3. Directional Effects

Since the reflectance of surfaces depends on the observation and illumination directions [6], particular attention has to be paid to the acquisition geometry. Directional effects are especially important in (nearly) specular reflections, but also in the case of shadow or volume effects.
The MultiSpectral Instrument (MSI) is composed of two focal planes covering the VNIR and the SWIR channels, respectively, each one having an array of 12 detectors. Due to the shifted positioning of the detectors along the track direction on the focal planes, angular differences between the two alternating odd and even clusters of detectors are induced in the measurements. The parallax base/height (B/H) ratio ranges between 0.022 and 0.059. A similar issue occurs between the VNIR and SWIR detectors, resulting in an inter-band B/H which is less than 0.01 for the VNIR channels and less than 0.018 for the SWIR.
The values of the solar and sensor angles on a 5 km grid are provided in the L1C product metadata. We leverage this information in the band reconstruction algorithms that will be used in this work.

1.4. Specific Uses of S2 Bands

The spectral bands of Sentinel-2 allow the computation of a large variety of spectral indices other than NDVI that are useful for different applications. Table 2 presents a selection of several of them.
The RE bands have been proposed for chlorophyll estimation, burn severity assessment [7], LAI estimation [8] and non photosynthetic vegetation [9]. The SWIR bands have been proposed for dry mass vegetation [10] and water or moisture indices [11].
Although a thorough review of the literature is out of the scope of this paper, a bibliometric analysis shows that very few papers published after the launch of Sentinel-2 make an explicit use of the spectral particularities (RE and SWIR bands). Furthermore, a recent review about phenology monitoring using Sentinel-2 [12] shows that only one out of four published papers uses spectral information other than NDVI.
Some studies, for instance [13], claim that RE and SWIR bands during vegetation senescence appear to be important for machine learning-based classification. The concept of importance has to be nuanced, since it measures the errors made when the reflectances of those bands are replaced by random values. In order to have an accurate assessment of the usefulness of those variables, the classifiers should be retrained without them. On the other hand, the same work shows that PSRI, which is computed from red, green and NIR, is also important, which may indicate a high correlation (and therefore redundancy) with RE bands.
Another work supporting the interest of RE and SWIR bands is [16], where they are shown to be useful for gross primary productivity estimation in grasslands. Using regression approaches, the authors show that those bands are useful to predict the target variable, but do not study whether by using more complex regressors the error without those bands could be reduced.
It is interesting to note that other works, for instance [17], show that NDVI is better suited to monitoring grass phenology than more sophisticated VIs using RE and SWIR bands. Another example is [18], where it is shown that the RE bands of Sentinel-2 do not perform well for the estimation of chlorophyll content changes in certain crops. One should note that before the launch of Sentinel-2 the same community had great expectations for these bands for the same application [19]. However, at the time, the authors already suggested that using the green band in $CI_{green}$ also seemed very promising and therefore further research was required.
The apparent contradictions between these different works are likely due to the fact that different experimental settings, different data and different applications were involved.
Further, we find that works on the usefulness of spectral bands usually only aim to demonstrate that a particular phenomenon has a signature in a particular band. For instance, a recent publication [20] proposed additional SWIR bands in order to detect non-photosynthetic vegetation and crop residues. The study indeed shows that these objects cannot be detected with the SWIR bands of Landsat-8. However, the cited work does not analyze how the complete set of Landsat-8 bands could be used to retrieve a signature of the phenomenon at hand.
As of this writing and to the best of our knowledge, the most thorough study of the usefulness of Sentinel-2’s spectral bands is [21]. This reference is actually a detailed literature review of the use of hyperspectral imagery with the goal of proposing synergies with Sentinel-2 in order to overcome the limitations of space-borne hyperspectral sensors (spatial resolution, revisit time and signal-to-noise ratio). Interestingly, the review shows how the current set of Sentinel-2 bands constitutes in itself a very wise choice for many applications. However, the limit of such a meta-analysis is that there cannot be a holistic view of the problem, since the pertinence of each spectral range is assessed in isolation in the reviewed literature. Indeed, this prevents discovering redundancies between different bands. For instance, this reference excludes uses for geological and lithological mapping, such as [22,23,24], where the higher resolution of Sentinel-2’s NIR bands is assessed for the estimation of iron oxides.
We think this supports the idea of performing a purely data-driven approach over a large dataset and with an application-agnostic point of view. However, the work presented in this paper is just a modest demonstration of what could be done by exploiting the existing Sentinel-2 archive.
Finally, we will stress again the fact that we do not claim that some Sentinel-2 bands do not contain useful information. We want to assess the possibility of reconstructing this information by leveraging redundancies among the complete set of spectral bands. This reconstruction will, of course, contain errors, and the goal here is to give bounds allowing informed design trade-offs for future systems.

2. Materials and Methods

2.1. Data Preparation

For this study, a set of 128 Sentinel-2 tiles was used. Figure 2 illustrates the geographic distribution of these tiles. For each tile, a single date was used and the selection was random for the period from early 2016 to the end of 2020. The goal was to cover a wide range of geographic areas and seasons. For each acquisition, the data were obtained at two processing levels: 1C (from PEPS, CNES mirror of Sentinel data) and 2A (from Theia’s catalog), the latter having been produced by the MAJA processor. This allows us to use accurate masks at the pixel level for clouds, cloud shadows and saturation effects.
For each acquisition, 100,000 pixels were sampled. Only non-saturated pixels were selected, regardless of their cloud or shadow status. Pixel positions were selected on the 20 m resolution grid. For each 20 m pixel position, the following information was recorded:
  • whether the pixel was detected as a cloud or a shadow (without distinction between these two states),
  • the reflectance in the 20 m bands for levels 1C and 2A,
  • the reflectance of the four corresponding pixels of each of the 10 m resolution bands for levels 1C and 2A,
  • the reflectance at the 20 m pixel position of the 60 m resolution bands after bicubic resampling for level 1C,
  • the solar and viewing angles for each pixel.
For the analyses performed in the following sections, we split the data at the tile level. This means that all the pixels used for testing (measures of accuracy of the reconstructions) belong to tiles for which no pixel was used for training or even intermediate validation.
In the experiments carried out in this work, we randomly select 100 tiles and perform an 80/20 split at the tile level for training and testing purposes. This means that training and testing pixels come from different tiles and dates. The training set is further split into proper training samples (80%) and validation samples (20%), the latter being used for monitoring the convergence of the training. Each experiment (i.e., set of predicted bands and set of predictor bands) is repeated 10 times by selecting a different set of 100 tiles among the 128 available. This allows checking for possible selection biases and allows further assessment of the robustness of the regressions.
Further, only clear pixels (non-cloudy and not shadow) are used for training and validating models. This reduces the number of available pixels. On average, each experiment uses $3.86928 \times 10^{6}$ training samples, 967,320 validation samples and $1.2582 \times 10^{6}$ testing samples and is repeated 10 times.
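The tile-level protocol described above can be sketched as follows. This is only an illustration under stated assumptions: the tile identifiers are placeholders, and the function name `split_tiles` and the use of the repetition index as random seed are hypothetical choices, not taken from the released dataset or code.

```python
import random

def split_tiles(tile_ids, n_selected=100, train_fraction=0.8, seed=0):
    """Draw n_selected tiles at random and split them into train and test tiles."""
    rng = random.Random(seed)
    # rng.sample returns the selected tiles in random order, so slicing is a random split.
    selected = rng.sample(list(tile_ids), n_selected)
    n_train = int(round(train_fraction * n_selected))
    return selected[:n_train], selected[n_train:]

# 128 placeholder tile identifiers (hypothetical names).
all_tiles = [f"tile_{k:03d}" for k in range(128)]

# Ten repetitions, each drawing a different set of 100 tiles among the 128.
repetitions = [split_tiles(all_tiles, seed=rep) for rep in range(10)]
train_tiles, test_tiles = repetitions[0]
```

Because the split is made on tiles rather than on pixels, no test pixel shares a tile (and hence an acquisition date) with any training pixel. The further 80/20 split of the training pixels into training and validation samples would be applied afterwards, within the training tiles.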
The dataset has been made public [25] and is available for other researchers to reproduce and improve the work presented in this paper.

2.2. Regression Model

As stated in the introduction, we aim at estimating a subset of the Sentinel-2 bands from the other ones. This estimation will be done using regression techniques. The regression algorithms will be calibrated and validated using the data described in Section 2.1. In this section, we describe the regression approach chosen.

2.2.1. Reflectance Estimation with Associated Uncertainties

The regression problem is posed as the estimation of one or several spectral bands as a nonlinear combination of a disjoint set of the available bands. For the prediction of a single band, this can be written as:
$$\hat{\rho}_i = f\left(\{\rho_j\}_{j \neq i}, \theta\right), \tag{1}$$
that is, the prediction of the reflectance of band $i$ is a function of the measured reflectances of the other bands and a set of parameters $\theta$ containing other pertinent information, such as solar and sensor angles. The regression can jointly estimate several spectral bands in a set $I$:
$$\{\hat{\rho}_i\}_{i \in I} = f\left(\{\rho_j\}_{j \notin I}, \theta\right)$$
The regression procedure should also produce a credibility interval of the estimation of the target variable. (We use the term credibility interval instead of confidence interval because we adopt a Bayesian point of view: we consider the estimated value is a random variable and the bounds of the interval are fixed, while the use of confidence intervals considers the bounds as random variables that result from repeated measures.) In order to do this, instead of regressing over the expected mean, we can implement a regression of the mean and the variance of the target variable. Estimating a mean and a variance means that we are assuming a Gaussian error model.
At inference (estimation) time, the mean will be used for variable estimation (in a Gaussian model the mean is the value with the highest probability), and the variance will be used to give the credibility interval.
Given a target value $y$ (in our case $\rho_i$) and the estimates $\hat{\mu}$ and $\hat{\sigma}$, the predictive likelihood of the target value given the estimates is the Gaussian distribution whose probability density function is
$$p(y \mid \hat{\mu}, \hat{\sigma}) = \frac{1}{\sqrt{2\pi\hat{\sigma}^2}} \exp\left(-\frac{(y - \hat{\mu})^2}{2\hat{\sigma}^2}\right)$$
We can therefore pose the regression problem as the minimization of a cost function given by the negative log-likelihood [26]. The log-likelihood takes the form:
$$\log p(y \mid \hat{\mu}, \hat{\sigma}) = -\frac{1}{2}\log(2\pi) - \frac{1}{2}\log\hat{\sigma}^2 - \frac{(y - \hat{\mu})^2}{2\hat{\sigma}^2}$$
Therefore, after removing the constant term and a multiplicative factor, the cost function to be minimized is:
$$L = \sum_i \left[\log\hat{\sigma}_i^2 + \frac{(y_i - \hat{\mu}_i)^2}{\hat{\sigma}_i^2}\right]$$
where the sum is taken over the training samples.
Beyond being the correct theoretical loss under a Gaussian error model, this penalty function can be interpreted as follows :
  • the term $(y_i - \hat{\mu}_i)^2$ penalizes the errors between the target value and the estimated mean;
  • these errors are weighted by the uncertainty estimation: larger errors will need larger values of $\sigma_i$ to lower the penalty;
  • in order to avoid allowing large errors on $\mu_i$ by always estimating large values of $\sigma_i$, large values of $\sigma_i$ are also penalized by the first term in the loss.
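The behavior of this loss can be checked with a few lines of code. The sketch below is a plain-Python illustration, not the training code used in this work; it assumes the network outputs $\log \sigma$ (as discussed in Section 2.2.3), and the function name `gaussian_nll` is hypothetical.

```python
import math

def gaussian_nll(y, mu, log_sigma):
    """Sum over samples of log(sigma^2) + (y - mu)^2 / sigma^2.

    The constant term and the factor 1/2 of the negative log-likelihood
    are dropped, as in the cost function of the text.
    """
    total = 0.0
    for y_i, mu_i, ls_i in zip(y, mu, log_sigma):
        var_i = math.exp(2.0 * ls_i)  # sigma^2 recovered from log(sigma)
        total += 2.0 * ls_i + (y_i - mu_i) ** 2 / var_i
    return total

# For a fixed error of 1.0, the loss is smallest when sigma^2 equals the squared error.
loss_matched = gaussian_nll([1.0], [0.0], [0.0])   # sigma = 1
loss_too_wide = gaussian_nll([1.0], [0.0], [0.5])  # sigma > 1
loss_too_narrow = gaussian_nll([1.0], [0.0], [-0.5])  # sigma < 1
```

For a given error, both an over-estimated and an under-estimated $\sigma$ increase the loss, which is the trade-off described in the list above.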

2.2.2. Regression Algorithm

The regression algorithm will have to find the approximation of function f in Equation (1) that minimizes the cost function described above. Since we don’t have prior knowledge of the shape of f, we choose to use a non-parametric approach. Among non-parametric algorithms for regression, feed-forward neural networks (multi-layer perceptrons, MLPs) seem a good choice because they are universal function approximators [27] that can be used in a multi-variate input and output setting and with custom cost functions. Conversely, other choices have limitations. For instance, linear and logistic regressions impose a strong prior on the shape of f, and random forest regression cannot predict several targets. The main drawback of neural networks is their lack of interpretability.
MLPs are composed of fully connected linear layers (sets of neurons computing linear combinations of the inputs) followed by nonlinearities $\phi$ called activation functions. Figure 3 illustrates an MLP with a single hidden layer with five neurons. A large number of layers with different numbers of neurons can be used. Training such a network consists in finding the set of weights $w_i$ that minimize the loss function for the set of training samples. Optimization is performed by stochastic gradient descent.
Another interesting property of MLPs is that they can be combined as elementary bricks in more complex architectures. This allows the introduction of some structure in the processing, which brings interpretability and the possibility of introducing some prior knowledge. We will develop this point in the next section.

2.2.3. Network Architecture

As stated above, the regression neural network will estimate the reflectances of the target bands using the reflectances of the other bands as predictors. All computations are performed for individual pixels. In order to take into account BRDF effects, the solar and sensor angles (both azimuth and zenith, as described in Section 1.3) are also used as predictors. More precisely, the sines and cosines of each angle are used.
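As an illustration of the angular predictors, the sketch below encodes the four angles as sines and cosines. It assumes angles given in degrees; the function name `angle_features` and the ordering of the outputs are hypothetical.

```python
import math

def angle_features(sun_zenith, sun_azimuth, view_zenith, view_azimuth):
    """Return [sin, cos] for each of the four angles (degrees), 8 values in total."""
    features = []
    for angle_deg in (sun_zenith, sun_azimuth, view_zenith, view_azimuth):
        rad = math.radians(angle_deg)
        features.extend([math.sin(rad), math.cos(rad)])
    return features

example = angle_features(30.0, 150.0, 5.0, 100.0)
```

One advantage of this encoding is that it removes the discontinuity of azimuth angles at 0°/360°: two nearly identical viewing directions on either side of north yield nearly identical predictors.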
Instead of using all predictors (reflectances and angles) together in a flat vector as input for an MLP as in Figure 3, we use an attention mechanism where the angular information modulates the spectral values. This is implemented as illustrated in Figure 4. First, the spectral and angular information are fed to the Angular MLP which is used to generate an attention mask. An attention mask is a vector of real numbers in [0, 1] with the same number of components as the data on which the attention is being applied. In our case, this is the vector of spectral bands. The Angular MLP is a standard MLP with a single hidden layer containing eight neurons and a SoftMax layer as output. The SoftMax function is an exponential normalization that maps a set of values to the unit interval (simplex in more than one dimension), $\sigma : \mathbb{R}^K \to [0, 1]^K$, and is defined by:
$$\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \quad \text{for } i = 1, \dots, K \text{ and } z = (z_1, \dots, z_K) \in \mathbb{R}^K,$$
where $z_i$ are the outputs of the layer preceding the SoftMax.
Therefore, the Angular MLP learns a set of multiplicative weights (this operation is represented by the ⊗ symbol in Figure 4) that will be applied to the input reflectances in order to perform an angular correction. It is interesting to note that this angular correction takes into account the spectral information itself, that is, the reflectances and the angles are both used to estimate the correction. It is therefore a kind of self-attention mechanism [28].
A residual connection (a simple, elementwise addition, represented by ⊕ in Figure 4) is used after the attention mask in order to keep spectral information that could be excessively removed by the attention mechanism before entering the Backbone MLP. The latter is used to embed the predictors into a feature space that will be used to feed the two modules used to estimate the target values and their uncertainties, respectively.
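The masking and residual steps can be sketched in a few lines. This is a minimal illustration under assumptions: the mask logits stand in for the output of the Angular MLP before its SoftMax layer, the residual is read as adding the input reflectances to the masked reflectances, and the function names are hypothetical.

```python
import math

def softmax(z):
    """Exponential normalization; subtracting the maximum avoids overflow."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def attend_with_residual(reflectances, mask_logits):
    """Apply a SoftMax attention mask to the reflectances, then add the input back."""
    mask = softmax(mask_logits)
    attended = [w * r for w, r in zip(mask, reflectances)]
    # Residual connection: elementwise addition of the unmasked input.
    return [r + a for r, a in zip(reflectances, attended)]

# Two bands with equal logits: each mask weight is 0.5.
out = attend_with_residual([0.1, 0.2], [0.0, 0.0])
```

Since the SoftMax weights sum to one, the mask can drive the contribution of some bands close to zero; the residual addition guarantees that the original reflectances still reach the Backbone MLP.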
The backbone part (a three-hidden-layer MLP with 10 neurons per layer) allows modeling of the correlation between the target variables and their uncertainties. The independent MLP branches (with the same architecture as the backbone) for $\mu$ and $\sigma$ specialize in the estimation of each set of information. Performing the regression for several target variables with the same network is a kind of multitask learning that is able to leverage the correlation between target variables and is more efficient than performing single target regressions.
For numerical stability and positivity constraints, instead of estimating $\sigma$ or $\sigma^2$, we estimate $\log \sigma$.
The output activation functions for the mean and the variance estimations are hyperbolic tangents so that the values are contained in the [−1, 1] interval. The output value is then rescaled into a predefined interval, [−0.2, 1.3] for $\mu$ and [$1 \times 10^{-5}$, 1.5] for $\sigma^2$. The rescaling for $\mu$ allows taking into account the fact that L2A reflectances can sometimes be negative due to over-correction. Reflectances can also be higher than one in specular conditions. The rescaling intervals could be learned from the data, but we set them for simplicity.
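A possible form of this rescaling is sketched below. The text does not specify how the [−1, 1] output is mapped to the target interval, so an affine mapping is assumed here; the function name `rescale_tanh` is hypothetical.

```python
import math

def rescale_tanh(raw, low, high):
    """Map an unbounded raw output to [low, high] through tanh and an affine rescaling."""
    t = math.tanh(raw)  # value in [-1, 1]
    return low + 0.5 * (t + 1.0) * (high - low)

# Interval used for the mean in the text: [-0.2, 1.3].
mu_mid = rescale_tanh(0.0, -0.2, 1.3)      # a raw output of 0 maps to the interval midpoint
mu_high = rescale_tanh(50.0, -0.2, 1.3)    # saturated output, close to the upper bound
mu_low = rescale_tanh(-50.0, -0.2, 1.3)    # saturated output, close to the lower bound
```

With this mapping, the predicted values cannot leave the predefined interval whatever the raw network output is.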
The regression of several bands simultaneously is done by a straightforward extension of the single target case. The output layers for both the means and the variances will have as many neurons as target variables. The loss function is just the sum of the losses for each target variable.
The network is trained for 100 epochs using an Adam optimizer [29] with a learning rate of 0.001 and a batch size of 256.

2.3. Measuring Redundancies in Sentinel-2 Bands

To assess the quality of the spectral regression approaches, we will analyze the statistical dependence between all the pairs of Sentinel-2 bands. Instead of measuring correlations, which are limited to linear (Pearson correlation) or monotonic (Spearman correlation) dependencies, we will use the mutual information, $I$. It measures the dissimilarity between the joint distribution of a pair of variables and the product of their marginals. It is therefore a measure of the distance to general statistical independence:
$$I(X;Y) = D_{KL}\left(P_{(X,Y)} \,\|\, P_X \otimes P_Y\right),$$
where $D_{KL}$ is the Kullback–Leibler divergence. The mutual information can also be written in terms of entropies ($H$) as follows:
$$I(X;Y) = H(X,Y) - H(X \mid Y) - H(Y \mid X) = H(Y) - H(Y \mid X) = H(X) - H(X \mid Y),$$
and it is therefore a measure of the reduction in uncertainty about one variable once the other is known. The mutual information is non-negative, but it is not upper-bounded. Therefore, we use a normalized version using the entropies of each variable:
$$I_{norm}(X;Y) = \frac{I(X;Y)}{\sqrt{H(X)\,H(Y)}}$$
We will study this measure for both L1C and L2A data.
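For illustration, a simple histogram-based estimate of this measure can be written as follows. This sketch is not the estimator used in this work: it assumes reflectances binned into equal-width bins over a fixed range, a normalization by the geometric mean of the two entropies, and the identity $I(X;Y) = H(X) + H(Y) - H(X,Y)$. The function names and the number of bins are hypothetical.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (in nats) of a discrete distribution given by counts."""
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def normalized_mutual_information(x, y, n_bins=32, lo=0.0, hi=1.0):
    """Histogram estimate of I(X;Y) / sqrt(H(X) H(Y)) for two lists of values."""
    def bin_index(v):
        k = int((v - lo) / (hi - lo) * n_bins)
        return min(max(k, 0), n_bins - 1)  # clip values outside [lo, hi]

    bx = [bin_index(v) for v in x]
    by = [bin_index(v) for v in y]
    n = len(bx)
    h_x = entropy(Counter(bx).values(), n)
    h_y = entropy(Counter(by).values(), n)
    h_xy = entropy(Counter(zip(bx, by)).values(), n)
    mi = h_x + h_y - h_xy
    denom = math.sqrt(h_x * h_y)
    return mi / denom if denom > 0 else 0.0

# Identical variables are fully redundant; the second pair is independent by construction.
nmi_same = normalized_mutual_information([0.1, 0.3, 0.6, 0.9], [0.1, 0.3, 0.6, 0.9])
nmi_indep = normalized_mutual_information([0.1, 0.1, 0.9, 0.9], [0.1, 0.9, 0.1, 0.9])
```

Histogram estimates depend on the binning and are biased for small samples, so the values obtained this way are indicative only.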

3. Results

3.1. Redundancies in Sentinel-2 Bands

As stated in Section 2.3, we start by analyzing the redundancies in Sentinel-2 spectral bands. Figure 5 displays the values of the normalized mutual information for all pairs of bands of L1C (left) and L2A (right) data.
Both levels of processing show the same patterns and nearly the same values, although L2A has slightly lower values of dependence. This may indicate that atmospheric corrections are able to remove effects with high correlation across bands.
We observe high values between the red edge band B5 and the red band, and between the two SWIR bands. Interestingly, B5 presents a relatively low dependence with respect to B6 and B7, and there is very small redundancy between B8 and B8A (it is, for instance, lower than between green and B5).
The highest values of mutual information are obtained between adjacent bands of the B6, B7, B8A triplet; B7 being the most similar to the others. B7 therefore seems a good candidate for reconstruction from other bands.
One limitation of this analysis is that only pairs of bands are compared, and therefore it is impossible to assess if the redundancies between, for instance, B7 and B6 are complementary to those between B7 and B8A, which would allow better reconstruction of B7 from the other two than if these redundancies were the same.
It is also interesting to note that B5 has all values higher than 0.4 (except for B8), which may indicate either the possibility of reconstructing it from the other bands, or conversely, of it being some sort of pivotal band to reconstruct the others.
The relatively low value of the mutual information between B8 and B8A may seem surprising since the latter is a subset of the former. Actually, this value is the same for B7 and B8, which are adjacent (see Figure 6). However, B8A has a width less than 20% of that of B8.
This means that these measures of mutual information are lower bounds of the amount of information that could be reconstructed from other bands.

3.2. Single Band Regression

We present in this section the performance of the reconstruction of each spectral band by applying the neural network regression algorithm described in Section 2.2. As stated before, only the 20 m bands are reconstructed, and the following data are used as predictors:
  • the sines and cosines of the four observation angles,
  • all 20 m bands except for the target one,
  • the values of the four 10 m pixels for B2, B3, B4 and B8 associated with the 20 m target pixel,
  • and, only for L1C, the value of the three 60 m bands interpolated (with a bicubic interpolator) to the coordinate of the center of the 20 m pixel.
Each regression case is repeated 10 times using the protocol described at the end of Section 2.1.

3.2.1. Analysis of Errors

Validation metrics are computed across all experiments and reported in Table 3 and Table 4 for L1C and L2A data, respectively. The tables present the root mean square error (RMSE), the mean absolute error (MAE), the relative error (RE) and the coefficient of determination ($R^2$). The rows of the tables are sorted by increasing values of RE for L1C and RMSE for L2A.
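These metrics can be computed as in the sketch below. The exact definition of the relative error is not given in the text; the mean of the absolute error divided by the true reflectance is assumed here, with a small floor `eps` to avoid division by zero. The function name `regression_metrics` is hypothetical.

```python
import math

def regression_metrics(y_true, y_pred, eps=1e-6):
    """Return RMSE, MAE, mean relative error and R^2 for two lists of reflectances."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    # Assumed definition: mean of |error| / |true value|.
    rel = sum(abs(e) / max(abs(t), eps) for e, t in zip(errors, y_true)) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)  # must be non-zero for R^2
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "re": rel, "r2": r2}

metrics = regression_metrics([0.1, 0.2, 0.3, 0.4], [0.11, 0.19, 0.31, 0.39])
```

In this toy case all absolute errors are equal, so RMSE and MAE coincide, while the relative error is dominated by the lowest reflectance.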
In Section 1.2, we concluded that 3% error for L1C and 0.01 in surface reflectance values for L2A were good targets for band reconstruction. Of course, we are measuring reconstruction errors using data which itself may have errors, even if they are below the radiometric specifications. Therefore, the error bounds need not be applied too strictly. Finally, Sentinel-2 can be considered to be over-specified in terms of radiometric quality for most applications, which makes using these error bounds rather conservative from our point of view.
We see that for L1C only the reconstruction of B7 has an RE lower than 3%, although the other red edge and NIR bands are below 3.8%. For L2A, B5, B6 and B7 have an RMSE lower than 0.01, and B8A is only slightly above this level.
Estimating the noise in surface reflectances using the RMSE can suffer from strong outliers. The MAE gives a measure that is robust to these cases and shows that even B12 could be considered for reconstruction.
The error values presented in Table 3 and Table 4 are averages over the validation samples and do not show the proportion of pixels that do not fulfill the radiometric requirements. For this purpose, Table 5 and Table 6 show the percentage of pixels whose error is larger than a given threshold.
Table 5 presents, for each L2A band, the percentage of pixels whose error is larger than a given threshold (from 0.01, which is the accuracy of the L2A processor, up to 0.025). We see that even for the best-predicted bands (in the red edge), fewer than 90% of the pixels fulfill the requirements. However, if the accuracy requirement is relaxed to 0.015, roughly 95% compliance is achieved for these three bands.
Table 6 shows the same results for L1C data. The performance seems to be much better than for L2A, but we must remember that the requirements for L1C are given as relative errors (the error must not exceed 3%).
Table 7 shows the percentage of L1C validation pixels whose relative error exceeds different thresholds. We see that the requirement has to be relaxed from 3% to 10% in order to get about 95% compliance for the red edge and NIR bands. This poor performance is mainly due to high relative errors in the low reflectances. Table 8, Table 9, Table 10, Table 11 and Table 12 show the compliance with relative error thresholds for different intervals of reflectances. The results confirm that reflectances lower than 0.1 contain most of the errors.
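The percentages reported in Tables 5 to 12 can be reproduced with a computation of the following kind. This is a sketch under stated assumptions: the function name is hypothetical, and the reflectance intervals are treated as half-open so that each pixel falls in exactly one of them.

```python
def percent_beyond(y_true, y_pred, threshold, relative=False, interval=None):
    """Percentage of pixels whose absolute (or relative) error exceeds `threshold`.

    interval : optional (low, high) pair restricting the pixels to those whose
               true reflectance lies in [low, high) (half-open by assumption).
    """
    selected = [
        (t, p) for t, p in zip(y_true, y_pred)
        if interval is None or interval[0] <= t < interval[1]
    ]
    if not selected:
        return float("nan")
    beyond = 0
    for t, p in selected:
        err = abs(t - p)
        if relative:
            err /= abs(t)  # assumes non-zero true reflectance
        if err > threshold:
            beyond += 1
    return 100.0 * beyond / len(selected)

y_true = [0.05, 0.08, 0.30, 0.60]
y_pred = [0.06, 0.081, 0.305, 0.61]
abs_pct = percent_beyond(y_true, y_pred, 0.007)
rel_pct = percent_beyond(y_true, y_pred, 0.03, relative=True)
low_pct = percent_beyond(y_true, y_pred, 0.03, relative=True, interval=(0.0, 0.1))
```

In this toy sample, the only pixel exceeding the 3% relative threshold has a reflectance below 0.1, which mirrors the behavior described above for low reflectances.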
Figure 7 and Figure 8 display scatterplots of predicted versus real reflectance values for the L1C and L2A bands, respectively. For clarity in the visualization, these scatterplots are generated with a small random sample of the validation data. Nevertheless, they show the general behavior and are coherent with the metrics presented in the tables above.
To complete the analysis of the errors, we present the histograms of the errors (true minus predicted reflectance) using the complete validation dataset (about 5 million pixels). Figure 9 shows the histograms for the L1C bands and Figure 10 for the L2A bands.

3.2.2. Analysis of the Uncertainty Estimation

As explained in Section 2.2.1, the regression model is also able to estimate the uncertainty of the predicted value by associating a variance with it. Since this variance is an estimation itself, its meaningfulness needs to be assessed.
The loss function used to train the model was chosen assuming a Gaussian error model. The histograms in Figure 9 and Figure 10 show that the distributions of the errors are not Gaussian. However, these distributions are unimodal, which may allow use of the estimated variance as a good proxy for the uncertainty of the estimation. In order to check this hypothesis, we will measure the proportion of pixels whose errors fall within a given multiple of the estimated standard deviation.
In the case of a Gaussian distribution, we have $P(\mu - 1\sigma \le X \le \mu + 1\sigma) \approx 68.27\%$, $P(\mu - 2\sigma \le X \le \mu + 2\sigma) \approx 95.45\%$ and $P(\mu - 3\sigma \le X \le \mu + 3\sigma) \approx 99.73\%$.
We can therefore compute the proportion of pixels having an absolute error lower than $\sigma$, $2\sigma$ and $3\sigma$ and compare the results to the probability values above.
Table 13 and Table 14 present the above-mentioned proportions of pixels whose errors are within the bounds given by the estimated sigma. We see that, although not identical, the proportions are relatively similar to what one should get in the Gaussian case.
It is important to understand that the value of $\sigma$ is provided by the regression algorithm as a prediction. These results show that this prediction of $\sigma$ is indeed a good proxy for the probability of the reflectance estimation being in the predicted interval. Therefore, the estimated $\sigma$ can be thresholded and used as a validity mask for the estimations.
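The comparison between empirical and Gaussian coverage, and the use of the predicted standard deviation as a mask, can be sketched as follows. The function names and the mask threshold are hypothetical; only the comparison principle comes from the text.

```python
import math

def gaussian_coverage(n):
    """Probability that a Gaussian variable lies within n standard deviations."""
    return math.erf(n / math.sqrt(2.0))

def empirical_coverage(y_true, y_pred, sigma_pred, n):
    """Percentage of pixels whose absolute error is below n times the predicted sigma."""
    inside = sum(
        1 for t, p, s in zip(y_true, y_pred, sigma_pred) if abs(t - p) < n * s
    )
    return 100.0 * inside / len(y_true)

def validity_mask(sigma_pred, max_sigma):
    """True where the predicted sigma does not exceed a user-chosen threshold."""
    return [s <= max_sigma for s in sigma_pred]

y_true = [0.2, 0.3, 0.4, 0.5]
y_pred = [0.21, 0.30, 0.45, 0.5]
sigma = [0.02, 0.01, 0.02, 0.01]
cov_1 = empirical_coverage(y_true, y_pred, sigma, 1)
cov_3 = empirical_coverage(y_true, y_pred, sigma, 3)
mask = validity_mask(sigma, 0.015)
```

The value of `max_sigma` would have to be chosen per application; the study does not prescribe one.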

3.3. Double-Band Regression

We present here the results for the case where two bands are predicted from the others. This case will, of course, produce higher estimation errors because for each predicted band there is one fewer predictor.
Table 15 and Table 16 present the results for the L1C and the L2A data. Each row in the tables presents the results for a pair of bands jointly predicted. The same quality metrics as for single-band regression are presented. Each table has 15 rows since we evaluate all possible combinations of pairs of bands.
The rows in Table 15 are sorted in increasing order of the maximum relative error of the pair of bands. This allows one to quickly see that only one pair of L1C bands (B06 and B07) can be predicted with less than 3% error, and that another pair (B07 and B8A) is slightly above this threshold.
For the L2A data presented in Table 16, the rows are sorted by increasing RMSE using the maximum of the pair in each row. In this case, only the pair (B5, B8A) fulfills the 0.01 error threshold, although the pair (B5, B6) is not much above this threshold.
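The enumeration of band pairs and the row ordering used in the two tables can be sketched as follows. The helper names are hypothetical, and the RE values in the usage example are taken from three of the band pairs in Table 15.

```python
from itertools import combinations

BANDS_20M = ["B05", "B06", "B07", "B8A", "B11", "B12"]

def band_pairs(bands=BANDS_20M):
    """All unordered pairs of 20 m bands (6 choose 2 = 15 pairs)."""
    return list(combinations(bands, 2))

def sort_by_worst(results):
    """Sort pairs by the larger of the two per-band metric values.

    results : dict mapping (band_a, band_b) -> (metric_a, metric_b),
              where the metric is RE for L1C or RMSE for L2A.
    """
    return sorted(results, key=lambda pair: max(results[pair]))

re_l1c = {
    ("B05", "B06"): (5.12e-2, 4.07e-2),
    ("B06", "B07"): (2.67e-2, 2.64e-2),
    ("B07", "B8A"): (3.47e-2, 4.34e-2),
}
ordered = sort_by_worst(re_l1c)
```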
Figure 11 presents the scatterplots for the two best pairs of L1C bands (the first two rows in Table 15). Although the scatterplots are generated by subsampling the test dataset for readability, one can see that the estimations are unbiased and have a small dispersion around the regression lines. One can also see that part of the error comes from pixels with reflectances higher than one, for which there is underestimation. Since the regression algorithm is configured to yield reflectances in the [−0.2, 1.3] interval, we can expect that the error in this interval is smaller than what is reported in the tables.
For L2A data, Figure 12 presents the scatterplots for the pairs of bands in the first three rows of Table 16. As with the L1C case, the scatterplots show unbiased estimations with small dispersions, except for the B12 band in the third pair. The random sample of the test set used for generating these scatterplots does not contain pixels showing the underestimation of reflectances higher than one, but they also exist.
It is difficult to explain these results. First of all, the pairs of bands that are predicted the best differ between L1C and L2A. This was already the case for the regression of a single band, but in that case we could clearly define two groups: red edge–NIR and SWIR. In the case of two bands, one could have expected that for a pair of bands to be correctly reconstructed, they would have to be nonadjacent so that the missing information could be reconstructed using the neighboring bands. However, we see that the best pair in L1C is (B06, B07) and that the second best pair in L2A is (B5, B6).
With the same kind of reasoning, one could have expected that the pair (B11, B12) should be the one with the largest errors, since reconstructing the SWIR bands using only the VIS–NIR range should be nearly impossible. This is the case in terms of relative error, but not in terms of RMSE, which makes SWIR a better candidate for L2A reconstruction than more spectrally distant pairs.
Figure 13 presents the scatterplots for the prediction of the SWIR bands in L1C (top row) and L2A (bottom row). Although the dispersions are large, there is no systematic bias in the estimation, which confirms the redundancy of spectral information for most surfaces.
Actually, if the scatterplots of Figure 13 were obtained for biophysical parameter estimations, for instance LAI, chlorophyll, biomass, etc., they would be considered very good (see, for instance [30] or [18]). Of course, image quality criteria need to be more strict than those of downstream tasks, but this kind of result suggests that the impact of reflectance noise in downstream applications needs to be assessed.

4. Discussion

From the results presented in Section 3, we can identify several limitations of this work.
First, analysis of the errors based on type of surface (material, land cover, vegetation status, etc.) should be carried out in order to assess the impact on different applications. Although the spatial sampling of the data for this study contained enough variability for the results to be general, particular surfaces with specificities may need special attention. Furthermore, selecting the appropriate samples in the areas of most interest for particular applications can allow fine-tuning of the regression algorithm and improve the performance of the estimations.
A second limitation is related to the choice of regression algorithm for the study. The goal of the work was not to propose an optimal regression algorithm, but rather to show that band reconstruction is possible using regression. The choice of the neural network with a negative log-likelihood as a loss function was made for simplicity of implementation, the possibility of performing multi-target regression, and the generation of uncertainties associated with the estimations. Other approaches could yield better results and even produce a different set of bands candidate for removal.
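For readers unfamiliar with this kind of loss, a Gaussian negative log-likelihood for a model that predicts both a mean and a variance per pixel can be written as in the sketch below. It is a generic formulation in the spirit of [26], not a copy of the published implementation, and the function name is hypothetical.

```python
import math

def gaussian_nll(y_true, mu, var):
    """Mean Gaussian negative log-likelihood over a batch of pixels.

    y_true : observed reflectances.
    mu     : predicted means.
    var    : predicted variances (must be strictly positive).
    """
    total = 0.0
    for y, m, v in zip(y_true, mu, var):
        total += 0.5 * (math.log(2.0 * math.pi * v) + (y - m) ** 2 / v)
    return total / len(y_true)

# An over-confident variance is penalized more than a realistic one
# for the same prediction error of 0.1.
overconfident = gaussian_nll([0.3], [0.2], [1e-4])
realistic = gaussian_nll([0.3], [0.2], [1e-2])
```

Minimizing this quantity pushes the predicted variance to match the squared error, which is what makes the predicted variance usable as an uncertainty estimate.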
All of the above suggests that replication of the study by other teams would be useful. For this purpose, the dataset has been published [25], and the source code is available for inspection and download (https://src.koda.cnrs.fr/mmdc/mmdc-legacy/-/blob/master/mmdc/spectral_regression.py, accessed on 10 May 2022).
A third limitation is the pixel-based approach taken here. Reconstructing a missing band from the reflectances of the other bands of the same pixel assumes uniqueness of the solution: one combination of observed bands can only correspond to one value of the missing band. Although the results of this study tend to show that this is the case, there are pixels for which the error is high. In the current setting, the regression algorithm is able to flag these pixels by reporting high uncertainty, but this is not fully satisfactory. One way of lifting the ambiguity would be to add some spatial context for the regression, so that the observations of neighboring pixels, and therefore the local texture, help the prediction. This could be implemented with spatial convolutional layers in the regression algorithm.
In the same way, a multi-temporal extension of the algorithm could improve the estimations. However, this extension is less straightforward than adding spatial context, since clouds and cloud shadows introduce irregular temporal sampling that should be taken into account. Further, the relative geometric accuracy of multi-temporal series should be taken into account in this case.

5. Conclusions

In this paper, we have investigated the possibility of reconstructing one or two of the 20 m resolution bands from Sentinel-2 using the remaining bands. The goal of the study was to assess the possibility of removing some of the current Sentinel-2 spectral bands for the next generations of similar satellites.
The advantage of working on band reconstruction is that the approach is independent of the application. The main rationale is that, if a band can be reconstructed with errors that are within the radiometric requirements of the sensor, downstream applications can use a reconstructed band instead of a real measurement.
The main findings of the study are that at least one of the bands among B5, B6, B7 and B8A could be removed from next generation sensors, as all of them can be reconstructed with small errors when the others are available. Removing two bands could be possible at the cost of slightly higher reconstruction errors.
We have also shown that the estimation of a credibility interval for the predicted reflectances is possible and can therefore be used as a quality mask.
However, this study has several limitations that have been clearly identified in Section 4 and that would need to be addressed in the future.
If the next generation of Sentinel-2 had one or several bands removed, one could argue that the approach presented in this work could not be applied, since the regression calibration (i.e., the neural network training) needs the target band. Several responses can be given to this argument. If the bands used as predictors remain the same in the new sensor, the regressions calibrated with the current Sentinel-2 data should be applicable.
If the bands used as predictors for this study were not available in the next generation of satellites, one could constitute enough training data by using acquisitions from a hyperspectral mission such as CHIME [31]. The appropriate spectral bands (predictors and targets) could be generated using the relative spectral responses of the next generation of Sentinel-2.
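Generating a band from a hyperspectral spectrum with a relative spectral response is commonly done with a response-weighted average. The sketch below assumes that the spectrum and the response are sampled on the same uniformly spaced wavelength grid; the function name is hypothetical.

```python
def simulate_band(spectrum, srf):
    """Band-equivalent reflectance as the response-weighted mean of a spectrum.

    spectrum : reflectances sampled on a wavelength grid.
    srf      : relative spectral response sampled on the same grid (assumption).
    """
    weight = sum(srf)
    if weight == 0:
        raise ValueError("spectral response is zero everywhere")
    return sum(r * w for r, w in zip(spectrum, srf)) / weight

# A response concentrated on one sample returns that sample's reflectance.
narrow = simulate_band([0.1, 0.2, 0.3], [0.0, 1.0, 0.0])
# A flat response returns the plain average.
flat = simulate_band([0.1, 0.3], [1.0, 1.0])
```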
Finally, given the temporal revisit of Sentinel-2, it would be interesting to evaluate the possibility of having different bands in different satellites of the constellation, so that the band predictions could be temporally interleaved. For instance, with two satellites, one could imagine removing B6 in the A unit and removing B7 in the B unit. In this configuration, the reconstruction of B6 at a given date could use the other bands for this acquisition, as well as the most recent acquisition of the other satellite, for which B6 would have been observed. This kind of scenario would allow for interesting configurations where one satellite of the pair could have the SWIR bands absent. Indeed, the variance observed in Figure 13 could be highly reduced if the two SWIR bands for a previous date were available. Of course, for surfaces where the SWIR signature can change quickly during cloudy periods (snowfall), the impact of this kind of setting should be studied. Fortunately, the available Sentinel-2 archive data make such a study possible.
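The interleaved scenario implies looking up, for each acquisition, the most recent date at which the missing band was observed by the other unit. A minimal sketch of that lookup is given below; the data layout and the function name are hypothetical, and cloud screening is ignored.

```python
from datetime import date

def latest_observation(acquisitions, band, before):
    """Most recent acquisition strictly before `before` in which `band` was observed.

    acquisitions : list of dicts with keys "date" (datetime.date) and
                   "bands" (dict mapping band name -> reflectance).
    Returns None when no earlier acquisition contains the band.
    """
    candidates = [
        a for a in acquisitions if a["date"] < before and band in a["bands"]
    ]
    return max(candidates, key=lambda a: a["date"], default=None)

# Unit A lacks B6 and unit B lacks B7, as in the example scenario.
series = [
    {"date": date(2022, 5, 1), "unit": "A", "bands": {"B5": 0.10, "B7": 0.30}},
    {"date": date(2022, 5, 6), "unit": "B", "bands": {"B5": 0.11, "B6": 0.25}},
    {"date": date(2022, 5, 11), "unit": "A", "bands": {"B5": 0.10, "B7": 0.31}},
]
previous_b6 = latest_observation(series, "B6", date(2022, 5, 11))
```

The reflectance found this way could be appended to the predictors of the current acquisition, together with the time gap between the two dates.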
Another interesting possibility of the approaches presented in this paper is the addition of new bands, but only in some satellites of the constellation. On this topic, we should stress the comments on [32], as we did in Section 1.4: the fact that a particular phenomenon has a signature in a particular band does not mean that this same phenomenon cannot be detected by using a (nonlinear) combination of other bands. The results presented in this paper indicate that the question can be reversed.
The attentive reader will have understood that many options are open to reduce costs and hardware complexity for the successors of the current Sentinel-2 system by leveraging spectral, spatial and temporal correlations of the observed surfaces through ground data processing.
This work is just an example of what could be done by using the richness of the Sentinel-2 archives. We think that with the help of other scientists, further studies could be defined. For instance, a subset of geographic areas and dates for each target application together with ground measurements could be made available. This would allow the objective assessment of errors due to the lack of particular bands.

Author Contributions

Conceptualization, J.I. and O.H.; methodology, J.I.; software, J.I. and J.M.; validation, J.I.; formal analysis, J.I. and O.H.; investigation, J.I.; resources, J.I., J.M. and O.H.; data curation, J.I.; writing—original draft preparation, J.I.; writing—review and editing, J.I., J.M. and O.H.; visualization, J.I.; supervision, J.I.; project administration, J.I.; funding acquisition, J.I. and O.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The dataset has been published [25], and the source code is available for inspection and download at https://src.koda.cnrs.fr/mmdc/mmdc-legacy/-/blob/master/mmdc/spectral_regression.py, accessed on 10 May 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Drusch, M.; Bello, U.D.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36.
  2. Team, E.S. GMES Sentinel-2 Mission Requirements Document; Technical Report; European Space Agency: Paris, France, 2007.
  3. Hagolle, O.; Huc, M.; Pascual, D.V.; Dedieu, G. A Multi-Temporal Method for Cloud Detection, Applied To Formosat-2, VENμS, Landsat and Sentinel-2 Images. Remote Sens. Environ. 2010, 114, 1747–1755.
  4. Hagolle, O.; Huc, M.; Villa Pascual, D.; Dedieu, G. A Multi-Temporal and Multi-Spectral Method To Estimate Aerosol Optical Thickness Over Land, for the Atmospheric Correction of Formosat-2, Landsat, VENμS and Sentinel-2 Images. Remote Sens. 2015, 7, 2668–2691.
  5. Gorroño, J.; Fomferra, N.; Peters, M.; Gascon, F.; Underwood, C.; Fox, N.; Kirches, G.; Brockmann, C. A Radiometric Uncertainty Tool for the Sentinel 2 Mission. Remote Sens. 2017, 9, 178.
  6. Roujean, J.L.; Leroy, M.; Deschamps, P.Y. A Bidirectional Reflectance Model of the Earth’s Surface for the Correction of Remote Sensing Data. J. Geophys. Res. Atmos. 1992, 97, 20455–20468.
  7. Fernández-Manso, A.; Fernández-Manso, O.; Quintano, C. Sentinel-2a Red-Edge Spectral Indices Suitability for Discriminating Burn Severity. Int. J. Appl. Earth Obs. Geoinf. 2016, 50, 170–175.
  8. Pasqualotto, N.; Delegido, J.; Van Wittenberghe, S.; Rinaldi, M.; Moreno, J. Multi-Crop Green Lai Estimation With a New Simple Sentinel-2 Lai Index (SeLI). Sensors 2019, 19, 904.
  9. Tian, J.; Su, S.; Tian, Q.; Zhan, W.; Xi, Y.; Wang, N. A Novel Spectral Index for Estimating Fractional Cover of Non-Photosynthetic Vegetation Using Near-Infrared Bands of Sentinel Satellite. Int. J. Appl. Earth Obs. Geoinf. 2021, 101, 102361.
  10. Jacques, D.C.; Kergoat, L.; Hiernaux, P.; Mougin, E.; Defourny, P. Monitoring Dry Vegetation Masses in Semi-Arid Areas with Modis Swir Bands. Remote Sens. Environ. 2014, 153, 40–49.
  11. Gao, B.C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266.
  12. Misra, G.; Cawkwell, F.; Wingler, A. Status of Phenological Research Using Sentinel-2 Data: A Review. Remote Sens. 2020, 12, 2760.
  13. Sitokonstantinou, V.; Papoutsis, I.; Kontoes, C.; Arnal, A.; Andrés, A.P.; Zurbano, J.A. Scalable Parcel-Based Crop Identification Scheme Using Sentinel-2 Data Time-Series for the Monitoring of the Common Agricultural Policy. Remote Sens. 2018, 10, 911.
  14. Merzlyak, M.N.; Gitelson, A.A.; Chivkunova, O.B.; Rakitin, V.Y. Non-Destructive Optical Detection of Pigment Changes During Leaf Senescence and Fruit Ripening. Physiol. Plant. 1999, 106, 135–141.
  15. McFeeters, S. The Use of the Normalized Difference Water Index (ndwi) in the Delineation of Open Water Features. Int. J. Remote Sens. 1996, 17, 1425–1432.
  16. Cerasoli, S.; Campagnolo, M.; Faria, J.; Nogueira, C.; Caldeira, M.d.C. On Estimating the Gross Primary Productivity of Mediterranean Grasslands Under Different Fertilization Regimes Using Vegetation Indices and Hyperspectral Reflectance. Biogeosciences 2018, 15, 5455–5471.
  17. Gómez-Giráldez, P.J.; Pérez-Palazón, M.J.; Polo, M.J.; González-Dugo, M.P. Monitoring Grass Phenology and Hydrological Dynamics of an Oak-Grass Savanna Ecosystem Using Sentinel-2 and Terrestrial Photography. Remote Sens. 2020, 12, 600.
  18. Clevers, J.; Kooistra, L.; van den Brande, M. Using Sentinel-2 Data for Retrieving Lai and Leaf and Canopy Chlorophyll Content of a Potato Crop. Remote Sens. 2017, 9, 405.
  19. Clevers, J.; Gitelson, A. Remote Estimation of Crop and Grass Chlorophyll and Nitrogen Content Using Red-Edge Bands on Sentinel-2 and -3. Int. J. Appl. Earth Obs. Geoinf. 2013, 23, 344–351.
  20. Hively, W.D.; Lamb, B.T.; Daughtry, C.S.; Serbin, G.; Dennison, P.; Kokaly, R.F.; Wu, Z.; Masek, J.G. Evaluation of Swir Crop Residue Bands for the Landsat Next Mission. Remote Sens. 2021, 13, 3718.
  21. Transon, J.; d’Andrimont, R.; Maugnard, A.; Defourny, P. Survey of Hyperspectral Earth Observation Applications From Space in the Sentinel-2 Context. Remote Sens. 2018, 10, 157.
  22. van der Meer, F.; van der Werff, H.; van Ruitenbeek, F. Potential of Esa’s Sentinel-2 for Geological Applications. Remote Sens. Environ. 2014, 148, 124–133.
  23. Van der Werff, H.; van der Meer, F. Sentinel-2 for Mapping Iron Absorption Feature Parameters. Remote Sens. 2015, 7, 12635–12653.
  24. Roberts, D.; Wilford, J.; Ghattas, O. Exposed Soil and Mineral Map of the Australian Continent Revealing the Land At Its Barest. Nat. Commun. 2019, 10, 5297.
  25. Inglada, J. Sentinel-2 L1C and L2A Pixel Samples for Band Regression; Zenodo: Geneva, Switzerland, 2021.
  26. Nix, D.; Weigend, A. Estimating the Mean and Variance of the Target Probability Distribution. In Proceedings of the 1994 IEEE International Conference on Neural Networks (ICNN’94), Orlando, FL, USA, 28 June–2 July 1994.
  27. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
  28. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is All you Need. In Advances in Neural Information Processing Systems 30; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Sussex, NB, Canada, 2017; pp. 5998–6008.
  29. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
  30. Camacho, F.; Fuster, B.; Li, W.; Weiss, M.; Ganguly, S.; Lacaze, R.; Baret, F. Crop Specific Algorithms Trained Over Ground Measurements Provide the Best Performance for Gai and Fapar Estimates From Landsat-8 Observations. Remote Sens. Environ. 2021, 260, 112453.
  31. Earth and Mission Science Division. Copernicus Hyperspectral Imaging Mission for the Environment—Mission Requirements Document; Technical Report; European Space Agency: Paris, France, 2019.
  32. Hively, W.D.; Lamb, B.T.; Dennison, P.; Serbin, G. Reflectance Spectra of Agricultural Field Conditions Supporting Remote Sensing Evaluation of Non-Photosynthetic Vegetation Cover; U.S. Geological Survey: Reston, VA, USA, 2021.
Figure 1. Sentinel-2A relative spectral responses from https://sentinels.copernicus.eu/documents/247904/685211/S2-SRF_COPE-GSEG-EOPG-TN-15-0007_3.0.xlsx, accessed on 10 May 2022.
Figure 2. Geographic distribution of the tiles used for the study.
Figure 3. A multi-layer perceptron with one input layer, one hidden layer and one output layer. Diagram adapted from https://github.com/PetarV-/TikZ, accessed on 10 May 2022.
Figure 4. Overview of the nonlinear regression of a set of spectral bands using other bands and angular information as predictors assuming a Gaussian error model.
Figure 5. Normalized mutual information.
Figure 6. Red edge and NIR bands.
Figure 7. Scatterplots for the single-band regression (L1C). The colors indicate the density of points.
Figure 8. Scatterplots for the single-band regression (L2A). The colors indicate the increasing density of points.
Figure 9. Histograms of the errors (true value minus prediction) for the L1C bands.
Figure 10. Histograms of the errors (true value minus prediction) for the L2A bands.
Figure 11. Scatterplots for the double-band regression (L1C). Each row in the figure corresponds to a row in Table 15. The colors indicate the increasing density of points.
Figure 12. Scatterplots for the double-band regression (L2A). Each row in the figure corresponds to a row in Table 16. The colors indicate the increasing density of points.
Figure 13. Scatterplots for the double-band regression of the SWIR bands in L1C (top) and L2A (bottom). The colors indicate the increasing density of points.
Table 1. Name and central wavelength of the Sentinel-2 spectral bands [1].

| Band | Central Wavelength (nm) | Spatial Resolution (m) |
|---|---|---|
| 1 - Coastal aerosol | 442.7 | 60 |
| 2 - Blue | 492.4 | 10 |
| 3 - Green | 559.8 | 10 |
| 4 - Red | 664.6 | 10 |
| 5 - Vegetation red edge | 704.1 | 20 |
| 6 - Vegetation red edge | 740.5 | 20 |
| 7 - Vegetation red edge | 782.8 | 20 |
| 8 - NIR | 832.8 | 10 |
| 8A - Narrow NIR | 864.7 | 20 |
| 9 - Water vapour | 945.1 | 60 |
| 10 - SWIR - Cirrus | 1373.5 | 60 |
| 11 - SWIR | 1613.7 | 20 |
| 12 - SWIR | 2202.4 | 20 |

Table 2. Spectral indices leveraging Sentinel-2 spectral bands for applications related to vegetation and water surfaces.

| Index | Formula | Application | Reference |
|---|---|---|---|
| $CI_{rededge}$ | $\frac{B7}{B5} - 1$ | Chlorophyll, burnt areas | [7] |
| $CI_{green}$ | $\frac{B7}{B3} - 1$ | Chlorophyll, burnt areas | [7] |
| REP | $705 + 35\,\frac{\frac{B4 + B7}{2} - B5}{B6 - B5}$ | Chlorophyll, burnt areas | [7] |
| MTCI | $\frac{B6 - B5}{B5 - B4}$ | Chlorophyll, burnt areas | [7] |
| NDRE1 | $\frac{B6 - B5}{B6 + B5}$ | Chlorophyll, burnt areas | [7] |
| NDRE2 | $\frac{B7 - B5}{B7 + B5}$ | Chlorophyll, burnt areas | [7] |
| TRBI | $\frac{B12 + B6}{B8A}$ | LAI estimation | [8] |
| NSSI | $\frac{B8A - B7}{B8A + B7}$ | Non-photosynthetic vegetation | [9] |
| PSRI | $\frac{B4 - B2}{B6}$ | Senescent vegetation | [14] |
| STI | $\frac{B11}{B12}$ | Tillage, dry vegetation | [10] |
| NDWI | $\frac{B3 - B8A}{B3 + B8A}$ | Water bodies | [15] |
| NDWI | $\frac{B8 - B11}{B8 + B11}$ | Water bodies | [11] |
| NDWI | $\frac{B8 - B12}{B8 + B12}$ | Water bodies | [11] |

Table 3. Single-band regression results for L1C. The colors in the RE (relative error) column indicate whether the specification is fulfilled (light gray), nearly fulfilled (middle gray) or unfulfilled (dark gray).

| Band | RMSE | MAE | RE | $R^2$ |
|---|---|---|---|---|
| B07 | $7.17 \times 10^{-3}$ | $3.90 \times 10^{-3}$ | $2.96 \times 10^{-2}$ | $9.96 \times 10^{-1}$ |
| B06 | $1.82 \times 10^{-2}$ | $4.77 \times 10^{-3}$ | $3.61 \times 10^{-2}$ | $9.88 \times 10^{-1}$ |
| B8A | $1.57 \times 10^{-2}$ | $5.33 \times 10^{-3}$ | $3.69 \times 10^{-2}$ | $9.91 \times 10^{-1}$ |
| B05 | $1.57 \times 10^{-2}$ | $4.46 \times 10^{-3}$ | $3.79 \times 10^{-2}$ | $9.92 \times 10^{-1}$ |
| B12 | $1.50 \times 10^{-2}$ | $9.18 \times 10^{-3}$ | $9.35 \times 10^{-2}$ | $9.83 \times 10^{-1}$ |
| B11 | $1.83 \times 10^{-2}$ | $1.26 \times 10^{-2}$ | $1.51 \times 10^{-1}$ | $9.85 \times 10^{-1}$ |

Table 4. Single-band regression results for L2A. The colors in the RMSE (root mean square error) column indicate whether the specification is fulfilled (light gray), nearly fulfilled (middle gray) or unfulfilled (dark gray).

| Band | RMSE | MAE | RE | $R^2$ |
|---|---|---|---|---|
| B5 | $7.33 \times 10^{-3}$ | $4.96 \times 10^{-3}$ | $2.07 \times 10^{-1}$ | $9.95 \times 10^{-1}$ |
| B6 | $8.26 \times 10^{-3}$ | $5.04 \times 10^{-3}$ | $1.28 \times 10^{-1}$ | $9.96 \times 10^{-1}$ |
| B7 | $8.42 \times 10^{-3}$ | $5.02 \times 10^{-3}$ | $1.18 \times 10^{-1}$ | $9.97 \times 10^{-1}$ |
| B8A | $1.11 \times 10^{-2}$ | $6.14 \times 10^{-3}$ | $2.23 \times 10^{-1}$ | $9.95 \times 10^{-1}$ |
| B12 | $1.49 \times 10^{-2}$ | $9.49 \times 10^{-3}$ | $2.41 \times 10^{-1}$ | $9.75 \times 10^{-1}$ |
| B11 | $2.06 \times 10^{-2}$ | $1.36 \times 10^{-2}$ | $4.05 \times 10^{-1}$ | $9.81 \times 10^{-1}$ |

Table 5. Percentage of pixels beyond a given absolute error threshold (L2A).

| Band | 0.01 | 0.015 | 0.02 | 0.025 |
|---|---|---|---|---|
| B5 | …9 | 4.76 | 1.94 | 0.91 |
| B6 | 12.60 | 5.35 | 2.67 | 1.51 |
| B7 | 12.47 | 5.31 | 2.60 | 1.39 |
| B8A | 19.19 | 8.73 | 4.40 | 2.47 |
| B11 | 46.75 | 34.45 | 25.03 | 18.11 |
| B12 | 33.81 | 22.00 | 14.96 | 10.41 |

Table 6. Percentage of pixels beyond a given absolute error threshold (L1C).

| Band | 0.01 | 0.015 | 0.02 | 0.025 |
|---|---|---|---|---|
| B05 | 6.86 | 2.37 | 1.21 | 0.78 |
| B06 | 8.09 | 3.07 | 1.42 | 0.79 |
| B07 | 8.95 | 3.15 | 1.19 | 0.51 |
| B8A | 13.49 | 4.83 | 1.87 | 0.82 |
| B11 | 43.70 | 30.51 | 20.97 | 14.35 |
| B12 | 30.20 | 20.38 | 14.14 | 9.75 |

Table 7. Percentage of pixels beyond a given relative error threshold (L1C).

| Band | 0.03 | 0.05 | 0.1 |
|---|---|---|---|
| B05 | 37.23 | 20.44 | 5.34 |
| B06 | 26.50 | 10.74 | 2.36 |
| B07 | 21.99 | 8.97 | 2.10 |
| B8A | 30.11 | 13.16 | 3.66 |
| B11 | 69.33 | 51.53 | 22.93 |
| B12 | 73.79 | 58.25 | 30.10 |

Table 8. Percentage of pixels beyond a given relative error threshold for reflectances in [0, 0.1] (L1C).

| Band | 0.03 | 0.05 | 0.1 |
|---|---|---|---|
| B05 | 49.66 | 29.34 | 7.85 |
| B06 | 39.48 | 16.89 | 2.25 |
| B07 | 45.16 | 23.58 | 4.94 |
| B8A | 48.46 | 27.90 | 7.02 |
| B11 | 75.16 | 60.25 | 32.95 |
| B12 | 74.92 | 60.26 | 33.89 |

Table 9. Percentage of pixels beyond a given relative error threshold for reflectances in [0.1, 0.25] (L1C).

| Band | 0.03 | 0.05 | 0.1 |
|---|---|---|---|
| B05 | 35.01 | 17.53 | 3.58 |
| B06 | 25.18 | 8.93 | 1.04 |
| B07 | 21.63 | 8.11 | 1.05 |
| B8A | 30.70 | 12.17 | 1.69 |
| B11 | 69.40 | 51.48 | 20.39 |
| B12 | 73.14 | 57.22 | 27.93 |

Table 10. Percentage of pixels beyond a given relative error threshold for reflectances in [0.25, 0.5] (L1C).

| Band | 0.03 | 0.05 | 0.1 |
|---|---|---|---|
| B05 | 7.98 | 2.69 | 0.37 |
| B06 | 16.71 | 5.98 | 0.42 |
| B07 | 14.57 | 4.03 | 0.26 |
| B8A | 19.82 | 4.63 | 0.24 |
| B11 | 65.31 | 45.43 | 15.98 |
| B12 | 70.94 | 52.66 | 19.58 |

Table 11. Percentage of pixels beyond a given relative error threshold for reflectances in [0.5, 0.75] (L1C).

| Band | 0.03 | 0.05 | 0.1 |
|---|---|---|---|
| B05 | 10.06 | 3.15 | 0.29 |
| B06 | 10.20 | 2.93 | 0.22 |
| B07 | 17.32 | 4.88 | 1.13 |
| B8A | 20.40 | 5.26 | 0.25 |
| B11 | 53.24 | 27.82 | 3.57 |
| B12 | 52.10 | 31.80 | 5.49 |

Table 12. Percentage of pixels beyond a given relative error threshold for reflectances in [0.75, 1] (L1C).

| Band | 0.03 | 0.05 | 0.1 |
|---|---|---|---|
| B05 | 13.94 | 8.35 | 1.36 |
| B06 | 5.30 | 1.17 | 0.07 |
| B07 | 3.87 | 1.30 | 0.73 |
| B8A | 6.61 | 0.86 | 0.02 |
| B11 | 85.90 | 75.64 | 51.28 |
| B12 | 92.86 | 90.00 | 81.43 |

Table 13. Probability of the absolute error being lower than n × σ (L1C).
Band   σ (68.27%)   2σ (95.45%)   3σ (99.73%)
B05    68.80        92.26         98.29
B06    70.23        93.14         98.53
B07    70.82        93.27         98.43
B8A    70.36        91.99         97.57
B11    56.86        84.11         94.94
B12    61.57        87.50         96.23
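The percentages in the column headers of Table 13 are the Gaussian reference values for 1, 2 and 3 standard deviations. The following minimal sketch shows how an empirical coverage of this kind could be computed. It assumes that σ is the standard deviation of the signed reconstruction error, which is an assumption of this sketch rather than a definition taken from the table; the function name `percent_within_n_sigma` and the synthetic data are hypothetical.

```python
import numpy as np

def percent_within_n_sigma(reference, predicted, n_values=(1, 2, 3)):
    """Percentage of pixels with |error| < n * sigma for each n.

    Assumption: sigma is the standard deviation of the signed error.
    """
    err = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    sigma = err.std()
    return {n: 100.0 * float(np.mean(np.abs(err) < n * sigma)) for n in n_values}

# Synthetic example: Gaussian errors added to hypothetical reflectances.
rng = np.random.default_rng(0)
ref = rng.uniform(0.05, 0.6, size=100_000)
pred = ref + rng.normal(0.0, 0.01, size=ref.size)
coverage = percent_within_n_sigma(ref, pred)
for n, pct in coverage.items():
    print(f"|error| < {n} sigma: {pct:.2f}%")
```

With purely Gaussian errors the three values should fall close to 68.27%, 95.45% and 99.73%. Under the same assumption, empirical values below the reference at 2σ and 3σ would indicate an error distribution with heavier tails than a Gaussian.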
Table 14. Probability of the absolute error being lower than n × σ (L2A).
Band   σ (68.27%)   2σ (95.45%)   3σ (99.73%)
B5     65.68        90.00         96.87
B6     71.84        93.85         98.70
B7     74.28        94.50         98.69
B8A    70.07        92.13         97.93
B11    56.18        81.51         93.04
B12    64.70        91.15         97.91
Table 15. Double-band regression results for L1C. The colors in the RE (relative error) columns indicate whether the specification is fulfilled (light gray), nearly fulfilled (middle gray) or unfulfilled (dark gray).
Band  RMSE         MAE          RE           R²           Band  RMSE         MAE          RE           R²
B06   9.28 × 10⁻³  4.79 × 10⁻³  2.67 × 10⁻²  9.93 × 10⁻¹  B07   1.07 × 10⁻²  4.93 × 10⁻³  2.64 × 10⁻²  9.93 × 10⁻¹
B07   1.68 × 10⁻²  5.49 × 10⁻³  3.47 × 10⁻²  9.89 × 10⁻¹  B8A   1.95 × 10⁻²  7.84 × 10⁻³  4.34 × 10⁻²  9.84 × 10⁻¹
B05   9.31 × 10⁻³  3.96 × 10⁻³  5.12 × 10⁻²  9.94 × 10⁻¹  B06   1.01 × 10⁻²  4.22 × 10⁻³  4.07 × 10⁻²  9.94 × 10⁻¹
B05   6.52 × 10⁻³  3.58 × 10⁻³  3.88 × 10⁻²  9.94 × 10⁻¹  B11   1.66 × 10⁻²  1.12 × 10⁻²  7.57 × 10⁻²  9.86 × 10⁻¹
B06   1.52 × 10⁻²  4.85 × 10⁻³  3.24 × 10⁻²  9.92 × 10⁻¹  B11   1.75 × 10⁻²  1.16 × 10⁻²  9.04 × 10⁻²  9.82 × 10⁻¹
B06   4.38 × 10⁻²  1.58 × 10⁻²  1.00 × 10⁻¹  9.47 × 10⁻¹  B8A   1.90 × 10⁻²  5.88 × 10⁻³  7.06 × 10⁻²  9.92 × 10⁻¹
B05   8.20 × 10⁻³  4.16 × 10⁻³  3.64 × 10⁻²  9.94 × 10⁻¹  B07   4.34 × 10⁻²  1.66 × 10⁻²  1.21 × 10⁻¹  8.41 × 10⁻¹
B05   3.79 × 10⁻²  8.14 × 10⁻³  9.38 × 10⁻²  8.74 × 10⁻¹  B12   1.92 × 10⁻²  1.07 × 10⁻²  1.27 × 10⁻¹  9.66 × 10⁻¹
B05   9.64 × 10⁻³  4.24 × 10⁻³  3.93 × 10⁻²  9.96 × 10⁻¹  B8A   4.63 × 10⁻²  1.93 × 10⁻²  1.31 × 10⁻¹  9.45 × 10⁻¹
B07   1.74 × 10⁻²  4.45 × 10⁻³  4.05 × 10⁻²  9.88 × 10⁻¹  B11   5.92 × 10⁻²  2.88 × 10⁻²  1.97 × 10⁻¹  8.47 × 10⁻¹
B8A   2.66 × 10⁻²  7.06 × 10⁻³  5.44 × 10⁻²  9.73 × 10⁻¹  B12   1.64 × 10⁻²  1.00 × 10⁻²  2.26 × 10⁻¹  9.72 × 10⁻¹
B07   1.93 × 10⁻²  5.07 × 10⁻³  4.74 × 10⁻²  9.87 × 10⁻¹  B12   5.17 × 10⁻²  2.00 × 10⁻²  2.37 × 10⁻¹  7.91 × 10⁻¹
B06   7.30 × 10⁻³  3.88 × 10⁻³  3.07 × 10⁻²  9.95 × 10⁻¹  B12   3.63 × 10⁻²  1.78 × 10⁻²  2.52 × 10⁻¹  8.59 × 10⁻¹
B8A   5.70 × 10⁻²  1.47 × 10⁻²  1.25 × 10⁻¹  8.87 × 10⁻¹  B11   6.20 × 10⁻²  2.80 × 10⁻²  2.79 × 10⁻¹  7.51 × 10⁻¹
B11   3.73 × 10⁻²  2.45 × 10⁻²  2.93 × 10⁻¹  9.18 × 10⁻¹  B12   3.26 × 10⁻²  2.03 × 10⁻²  2.24 × 10⁻¹  9.05 × 10⁻¹
Table 16. Double-band regression results for L2A. The colors in the RE (relative error) columns indicate whether the specification is fulfilled (light gray), nearly fulfilled (middle gray) or unfulfilled (dark gray).
Band  RMSE         MAE          RE           R²           Band  RMSE         MAE          RE           R²
B5    7.38 × 10⁻³  4.96 × 10⁻³  1.77 × 10⁻¹  9.95 × 10⁻¹  B8A   9.31 × 10⁻³  6.22 × 10⁻³  1.38 × 10⁻¹  9.96 × 10⁻¹
B5    1.11 × 10⁻²  6.07 × 10⁻³  1.80 × 10⁻¹  9.95 × 10⁻¹  B6    1.33 × 10⁻²  6.37 × 10⁻³  1.25 × 10⁻¹  9.94 × 10⁻¹
B6    8.68 × 10⁻³  5.07 × 10⁻³  1.91 × 10⁻¹  9.96 × 10⁻¹  B12   1.53 × 10⁻²  9.59 × 10⁻³  3.52 × 10⁻¹  9.77 × 10⁻¹
B6    1.24 × 10⁻²  6.70 × 10⁻³  5.37 × 10⁻¹  9.89 × 10⁻¹  B7    1.54 × 10⁻²  7.70 × 10⁻³  1.51 × 10⁻¹  9.85 × 10⁻¹
B5    1.26 × 10⁻²  5.35 × 10⁻³  2.03 × 10⁻¹  9.94 × 10⁻¹  B11   1.72 × 10⁻²  1.15 × 10⁻²  1.94 × 10⁻¹  9.85 × 10⁻¹
B7    7.30 × 10⁻³  4.72 × 10⁻³  1.23 × 10⁻¹  9.96 × 10⁻¹  B12   1.74 × 10⁻²  1.11 × 10⁻²  1.97 × 10⁻¹  9.78 × 10⁻¹
B7    1.13 × 10⁻²  5.98 × 10⁻³  1.10 × 10⁻¹  9.94 × 10⁻¹  B8A   1.74 × 10⁻²  7.70 × 10⁻³  1.65 × 10⁻¹  9.87 × 10⁻¹
B8A   1.21 × 10⁻²  6.10 × 10⁻³  1.91 × 10⁻¹  9.95 × 10⁻¹  B12   1.79 × 10⁻²  1.08 × 10⁻²  3.03 × 10⁻¹  9.75 × 10⁻¹
B8A   1.07 × 10⁻²  6.74 × 10⁻³  1.09 × 10⁻¹  9.94 × 10⁻¹  B11   1.93 × 10⁻²  1.31 × 10⁻²  2.22 × 10⁻¹  9.79 × 10⁻¹
B6    9.79 × 10⁻³  5.60 × 10⁻³  1.47 × 10⁻¹  9.95 × 10⁻¹  B8A   1.99 × 10⁻²  8.58 × 10⁻³  2.10 × 10⁻¹  9.81 × 10⁻¹
B11   4.14 × 10⁻²  2.89 × 10⁻²  3.29 × 10⁻¹  9.01 × 10⁻¹  B12   3.54 × 10⁻²  2.36 × 10⁻²  4.86 × 10⁻¹  8.90 × 10⁻¹
B7    3.74 × 10⁻²  1.41 × 10⁻²  5.18 × 10⁻¹  9.21 × 10⁻¹  B11   4.98 × 10⁻²  2.50 × 10⁻²  2.37 × 10⁻¹  8.56 × 10⁻¹
B5    5.80 × 10⁻²  1.17 × 10⁻²  2.11 × 10⁻¹  8.73 × 10⁻¹  B12   1.71 × 10⁻²  1.09 × 10⁻²  4.36 × 10⁻¹  9.76 × 10⁻¹
B6    6.55 × 10⁻²  1.44 × 10⁻²  7.58 × 10⁻¹  8.01 × 10⁻¹  B11   4.06 × 10⁻²  1.88 × 10⁻²  3.21 × 10⁻¹  8.75 × 10⁻¹
B5    7.70 × 10⁻²  4.45 × 10⁻²  2.87 × 10⁻¹  8.35 × 10⁻¹  B7    9.79 × 10⁻³  5.83 × 10⁻³  9.24 × 10⁻²  9.96 × 10⁻¹
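To make the four metrics reported for the double-band experiments concrete, the following sketch reconstructs two removed bands from the remaining ones and computes RMSE, MAE, RE and R² for each. It is only a sketch under stated assumptions: ordinary least squares is used as a stand-in regressor (the estimator behind the tables may differ), RE is taken as the mean of the per-pixel relative errors, and the data are synthetic. The helper names `fit_linear`, `predict` and `band_metrics` are hypothetical.

```python
import numpy as np

def fit_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W (stand-in regressor, an assumption)."""
    A = np.column_stack([X, np.ones(len(X))])
    W, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return W

def predict(X, W):
    """Apply the fitted linear model, including the intercept column."""
    return np.column_stack([X, np.ones(len(X))]) @ W

def band_metrics(y_true, y_pred):
    """RMSE, MAE, mean relative error (assumed definition) and R2 for one band."""
    err = y_pred - y_true
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    re = float(np.mean(np.abs(err) / y_true))
    r2 = float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return {"RMSE": rmse, "MAE": mae, "RE": re, "R2": r2}

# Synthetic reflectances: 8 retained bands predict 2 removed bands.
rng = np.random.default_rng(42)
n = 5000
X = rng.uniform(0.05, 0.5, size=(n, 8))
true_W = rng.uniform(0.0, 0.25, size=(8, 2))
Y = X @ true_W + 0.02 + rng.normal(0.0, 0.005, size=(n, 2))

# Fit on the first 4000 pixels, evaluate on the remaining 1000.
W = fit_linear(X[:4000], Y[:4000])
Y_hat = predict(X[4000:], W)
metrics = [band_metrics(Y[4000:, k], Y_hat[:, k]) for k in range(2)]
for k, m in enumerate(metrics):
    print(f"removed band {k}: " + ", ".join(f"{name}={v:.3g}" for name, v in m.items()))
```

Evaluating on pixels that were not used for fitting keeps the reported errors from being optimistic. Because RE divides by the observed reflectance, it can grow large for dark pixels even when RMSE and MAE stay small.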
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Inglada, J.; Michel, J.; Hagolle, O. Assessment of the Usefulness of Spectral Bands for the Next Generation of Sentinel-2 Satellites by Reconstruction of Missing Bands. Remote Sens. 2022, 14, 2503. https://doi.org/10.3390/rs14102503

