# Joint Characterization of Sentinel-2 Reflectance: Insights from Manifold Learning


## Abstract


## 1. Introduction

- Geophysical
  - What is the overall S,V,D fraction distribution of globally diverse representatives of significant land cover categories?
  - How well does the global S,V,D model fit each land cover category, as measured by root mean square misfit?
- Topological
  - How clustered or continuous are the manifolds for each land cover category found by UMAP?
- Joint
  - To what extent can S,V,D fractions and UMAP clusters be used together to yield useful information? Specifically:
    - i. To what extent are UMAP clusters geographically contiguous?
    - ii. To what extent do disparate UMAP clusters at similar S,V,D fraction values represent physically plausible and/or spectroscopically interpretable spectral variability?
    - iii. Are some S,V,D fractions, or land cover classes, better suited to joint characterization (JC) than others? If so, why? If not, why not?

## 2. Materials and Methods

#### 2.1. Data

#### 2.2. Methods

- Step A: Use a linear spectral mixture model to characterize the overall S,V,D distribution of each land cover class (variance-based, physical, linear).
- Step B: Use Uniform Manifold Approximation and Projection (UMAP; [33]) to characterize interdimensional topology and clustering (topology-based, statistical, nonlinear).
- Step C: Synthesize Steps A and B into a set of one or more bivariate distributions that use the physical meaning of the Step A fraction distributions to differentiate among purely topological relations identified in Step B (joint characterization).
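To make the third step concrete, the following is a minimal sketch of how one mixture fraction might be binned against one UMAP coordinate to form a bivariate distribution, and how a rectangular region of interest might then be selected from that joint space. It assumes both inputs are already available as per-pixel arrays; the function names `joint_distribution` and `select_region` are hypothetical and are not taken from the study.

```python
import numpy as np


def joint_distribution(fraction, umap_coord, bins=100):
    """Bin one mixture fraction against one UMAP coordinate (hypothetical helper).

    Returns the 2D histogram counts and the bin edges along each axis.
    """
    fraction = np.asarray(fraction, dtype=float).ravel()
    umap_coord = np.asarray(umap_coord, dtype=float).ravel()
    if fraction.shape != umap_coord.shape:
        raise ValueError("fraction and umap_coord must have one value per pixel")
    counts, frac_edges, umap_edges = np.histogram2d(fraction, umap_coord, bins=bins)
    return counts, frac_edges, umap_edges


def select_region(fraction, umap_coord, frac_range, umap_range):
    """Boolean mask of pixels inside a rectangular region of the joint space."""
    fraction = np.asarray(fraction, dtype=float).ravel()
    umap_coord = np.asarray(umap_coord, dtype=float).ravel()
    in_frac = (fraction >= frac_range[0]) & (fraction <= frac_range[1])
    in_umap = (umap_coord >= umap_range[0]) & (umap_coord <= umap_range[1])
    return in_frac & in_umap
```

The resulting mask can be reshaped to the image dimensions to check whether a cluster selected in the joint space is geographically coherent, or used to average the spectra belonging to that cluster.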

#### 2.2.1. Step A: Linear Characterization and Modeling: Spectral Mixture Analysis

($S_I$, $V_I$, $D$) endmembers [32]. Bivariate fraction distributions (Figure 3, right) show fraction estimates to be within the physical range (0 to 100%) for all tiles except high albedo sands, which give $S_I$ fractions greater than 100% and D fractions less than 0. Mixture model misfit, quantified as the root mean square error (RMSE) between observed and modeled spectra, was less than 6% for over 99% of spectra. Because the unit sum constraint confines the 3D S,V,D fraction space to a 2D plane, fraction distributions can also be visualized using a barycentric plot (i.e., ternary diagram) with no loss of information. The remainder of this analysis uses such a visualization to demonstrate variability in S,V,D fraction abundance among land cover classes. For greater detail on variance-based characterization of this mosaic, see [32].
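The sketch below illustrates one way a unit-sum-constrained linear mixture model could be inverted, with the per-pixel RMSE and the ternary projection computed from the resulting fractions. It is an illustration under stated assumptions, not the procedure used in the study: the endmember matrix is assumed to hold three spectra ordered S, V, D; the unit sum constraint is imposed by substituting the last fraction as one minus the others; and the function names `unmix_unit_sum` and `ternary_xy`, as well as the placement of the ternary vertices, are hypothetical. No non-negativity constraint is applied, so fractions outside the 0 to 1 range remain possible, as the text reports for high albedo sands.

```python
import numpy as np


def unmix_unit_sum(spectra, endmembers):
    """Least-squares unmixing with an exact unit sum constraint (hypothetical helper).

    spectra:    (n_pixels, n_bands) reflectance spectra
    endmembers: (3, n_bands) endmember spectra, assumed ordered S, V, D
    Returns fractions (n_pixels, 3) and per-pixel RMSE (n_pixels,).
    """
    spectra = np.asarray(spectra, dtype=float)
    E = np.asarray(endmembers, dtype=float)
    e_last = E[-1]
    # Substituting f_D = 1 - f_S - f_V leaves two free fractions per pixel.
    A = (E[:-1] - e_last).T           # (n_bands, 2)
    B = (spectra - e_last).T          # (n_bands, n_pixels)
    f_free, *_ = np.linalg.lstsq(A, B, rcond=None)
    fractions = np.vstack([f_free, 1.0 - f_free.sum(axis=0)]).T
    modeled = fractions @ E
    rmse = np.sqrt(np.mean((spectra - modeled) ** 2, axis=1))
    return fractions, rmse


def ternary_xy(fractions):
    """Project unit-sum fractions onto 2D ternary coordinates.

    Vertex placement is an arbitrary choice: S at (0, 0), V at (1, 0),
    and D at (0.5, sqrt(3)/2).
    """
    f = np.asarray(fractions, dtype=float)
    x = f[:, 1] + 0.5 * f[:, 2]
    y = (np.sqrt(3.0) / 2.0) * f[:, 2]
    return np.column_stack([x, y])
```

Because the fractions sum to one by construction, the two ternary coordinates carry the same information as the three fractions.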

#### 2.2.2. Step B: Nonlinear Characterization and Modeling: Manifold Learning

- `n_components`: The number of dimensions of the low-D embedding space.
- `n_neighbors`: The size of the local neighborhood used when learning the manifold structure of the data.
- `min_dist`: The limit on how closely points may be spaced in the output space.
- `metric`: The distance metric in the input space.

- `n_components` = 2
- `n_neighbors` = 30
- `min_dist` = 0.1
- `metric` = Euclidean

`umap-learn` Python package on a commercially available laptop computer with 32 GB RAM, a 2 GHz quad-core Intel Core i5 CPU, and a 1536 MB Intel Iris Plus Graphics GPU. Runtime for a typical 10-tile subset (10,000,000 11-band spectra) was approximately 2 hours. For more information about UMAP, see [45].
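A minimal sketch of how the parameter settings listed above might be passed to `umap-learn` is given below. The input is assumed to be an array with one row per pixel spectrum and one column per band; the wrapper name `embed_spectra` and the `random_state` argument are assumptions added for illustration and are not stated in the text.

```python
import numpy as np

# Parameter values as listed above.
UMAP_PARAMS = {
    "n_components": 2,
    "n_neighbors": 30,
    "min_dist": 0.1,
    "metric": "euclidean",
}


def embed_spectra(spectra, random_state=None):
    """Embed (n_pixels, n_bands) spectra into the low-D UMAP space.

    Hypothetical wrapper; random_state is an added assumption that can be
    set to an integer for repeatable embeddings.
    """
    import umap  # provided by the umap-learn package

    spectra = np.asarray(spectra, dtype=float)
    reducer = umap.UMAP(random_state=random_state, **UMAP_PARAMS)
    return reducer.fit_transform(spectra)


# Example call on a small random stand-in for 11-band spectra:
# embedding = embed_spectra(np.random.rand(1000, 11), random_state=0)
# embedding.shape == (1000, 2)
```

Runtime and memory use grow with the number of spectra, so a small subset is a sensible first test before attempting a subset of the size described above.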

#### 2.2.3. Step C: Joint Characterization: Bivariate Distributions and Cluster Identification

## 3. Results

#### 3.1. Agriculture

#### 3.2. Sands

#### 3.3. Lava and Ash

#### 3.4. Urban

#### 3.5. Forests

#### 3.6. Senescent Vegetation

#### 3.7. Tundra

#### 3.8. Mangroves and Wetlands

#### 3.9. Rocks and Alluvium

## 4. Discussion

#### 4.1. Revisiting the Motivating Questions

#### 4.1.1. Question 1: Variance-Based Characterization & Modeling

#### 4.1.2. Question 2: Topology-Based Characterization & Modeling

#### 4.1.3. Question 3: Leveraging Variance & Topology with Joint Characterization

#### 4.2. Why JC Works: A Convergence of Visions

#### 4.2.1. The Geophysical Vision: Projecting Each Pixel Spectrum Independently onto the Global Mixing Space

$L_2$ norm. Inverting the linear mixture model to obtain estimates of endmember (EM) fractions provides a continuous result that is easily validated by comparison with higher spatial resolution imagery (vicarious validation) or in situ field measurements. A key assumption of this approach is that global variance is representative of information content.

#### 4.2.2. The Statistical Vision: Learning High-Dimensional Structure within and among Clusters of Similar Pixel Spectra

#### 4.2.3. Fusing These Two Visions: Joint Characterization

#### 4.3. Limitations and Future Work

#### 4.3.1. Limitations

#### 4.3.2. Future Work

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Figure A1.** UMAP dependence on the `n_neighbors` parameter. UMAP results for the image mosaic are shown for a range of `n_neighbors` settings. All runs use `min_dist` = 0.1 and the Euclidean distance metric.

**Figure A2.** UMAP dependence on the `min_dist` parameter. UMAP results for the image mosaic are shown for a range of `min_dist` settings. All runs use 30 nearest neighbors and the Euclidean distance metric.

**Figure A3.** UMAP results using a 3D embedding space. This figure uses `min_dist` = 0.1, 30 nearest neighbors, and the Euclidean distance metric.

**Figure A4.** UMAP dependence on choice of distance metric. Results are shown for the image mosaic. All runs use 30 nearest neighbors and `min_dist` = 0.1.

**Figure A5.** Comparison to other manifold learning algorithms. Results are shown for Laplacian Eigenmap-based Spectral Embedding (LE), ISOMAP, and t-distributed Stochastic Neighbor Embedding (t-SNE). LE and ISOMAP runs both use the nearest neighbor affinity metric, with `n_neighbors` = 10. t-SNE uses perplexity = 30 and random initialization.

**Agriculture**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170205T210921_N0204_R057_T04QHH | 4N | 868610 | 2223190 |
S2A_MSIL1C_20170315T101021_N0204_R022_T32TPP | 32N | 623950 | 4864330 |
S2A_MSIL1C_20170508T012701_N0205_R074_T54STE | 54N | 269220 | 3988590 |
S2A_MSIL1C_20170723T064631_N0205_R020_T41TKG | 41N | 266210 | 4645260 |
S2A_MSIL1C_20170917T190351_N0205_R113_T10SFG | 10N | 688930 | 4167330 |
S2A_OPER_PRD_MSIL1C_PDMC_20161017T044357 | 45N | 723470 | 2625060 |
S2B_MSIL1C_20170730T040549_N0205_R047_T47SND | 47N | 554190 | 4363690 |
S2B_MSIL1C_20170918T054629_N0205_R048_T43SDT | 43N | 459570 | 3800040 |
S2B_MSIL1C_20171008T105009_N0205_R051_T30TYN | 30N | 702100 | 4787760 |
S2B_MSIL1C_20171013T081959_N0205_R121_T36SYF | 36N | 778000 | 4095680 |

**Sand**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170628T173901_N0205_R098_T13SCS | 13N | 372290 | 3654900 |
S2A_MSIL1C_20170908T063621_N0205_R120_T40QFK | 40N | 653400 | 2447190 |
S2A_MSIL1C_20171119T040041_N0206_R004_T48TUK | 48N | 305540 | 4438710 |
S2A_MSIL1C_20171208T111441_N0206_R137_T29QKD | 29N | 291550 | 2399280 |
S2A_MSIL1C_20171209T072301_N0206_R006_T38QND | 38N | 527910 | 1890720 |
S2B_MSIL1C_20171207T105419_N0206_R051_T30RVT | 30N | 481880 | 3290910 |
S2B_MSIL1C_20171208T084329_N0206_R064_T33JWN | 33S | 541880 | 7265640 |
S2B_MSIL1C_20171212T100359_N0206_R122_T32RLQ | 32N | 339750 | 2966720 |
S2B_MSIL1C_20171212T100359_N0206_R122_T32RLR | 32N | 331950 | 3100020 |

**Lava & Ash**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170205T210921_N0204_R057_T04QHH | 4N | 861160 | 2206290 |
S2A_MSIL1C_20171016T073911_N0205_R092_T36MZC | 36S | 819250 | 9703580 |
S2A_MSIL1C_20171016T073911_N0205_R092_T36MZC | 36S | 834220 | 9768640 |
S2A_OPER_PRD_MSIL1C_PDMC_20161014T163303 | 15S | 652170 | 9967520 |
S2B_MSIL1C_20170723T124309_N0205_R095_T28WDT | 28N | 399960 | 7200220 |

**Urban**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170508T012701_N0205_R074_T54STE | 54N | 269890 | 3950620 |
S2A_MSIL1C_20170830T131241_N0205_R138_T23KLP | 23S | 328970 | 7398470 |
S2A_MSIL1C_20170916T055631_N0205_R091_T42RUN | 42N | 300000 | 2758120 |
S2A_MSIL1C_20171017T103021_N0205_R108_T32TLQ | 32N | 390060 | 4999690 |
S2B_MSIL1C_20170912T170949_N0205_R112_T14RLP | 14N | 364980 | 2848280 |

**Forest—1**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170118T081241_N0204_R078_T35MRV | 35S | 831290 | 9963030 |
S2A_MSIL1C_20170119T074231_N0204_R092_T36JTT | 36S | 284150 | 7247210 |
S2A_MSIL1C_20170205T210921_N0204_R057_T04QHH | 4N | 847400 | 2230620 |
S2A_MSIL1C_20170427T021921_N0205_R060_T50HLH | 50S | 355240 | 6230970 |
S2A_MSIL1C_20170508T012701_N0205_R074_T54STE | 54N | 257880 | 3907290 |
S2A_MSIL1C_20170604T043701_N0205_R033_T45RYL | 45N | 794940 | 3088140 |
S2A_MSIL1C_20170705T022551_N0205_R046_T50NMN | 50N | 450950 | 704020 |
S2A_MSIL1C_20170724T145731_N0205_R039_T18LZL | 18S | 875170 | 8546360 |
S2A_MSIL1C_20170724T145731_N0205_R039_T19LBF | 19S | 215640 | 8582190 |
S2A_MSIL1C_20170830T131241_N0205_R138_T23KLP | 23S | 321220 | 7348390 |

**Forest—2**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170917T190351_N0205_R113_T10SFG | 10N | 607440 | 4106660 |
S2A_OPER_PRD_MSIL1C_PDMC_20151206T145051 | 20N | 469370 | 431170 |
S2B_MSIL1C_20170713T023549_N0205_R089_T51RTN | 51N | 231700 | 3257530 |
S2B_MSIL1C_20170718T101029_N0205_R022_T32TQS | 32N | 773730 | 5121020 |
S2B_MSIL1C_20170906T002659_N0205_R016_T55KCA | 55S | 353630 | 8006280 |
S2B_MSIL1C_20170912T084549_N0205_R107_T36TUL | 36N | 335150 | 4512660 |
S2B_MSIL1C_20171009T003649_N0205_R059_T55MDP | 55S | 469610 | 9317570 |
S2B_MSIL1C_20171013T081959_N0205_R121_T36SYF | 36N | 791100 | 4092030 |
S2B_MSIL1C_20171116T132219_N0206_R038_T23KKP | 23S | 215910 | 7344400 |
S2B_MSIL1C_20171215T152629_N0206_R025_T18NUF | 18N | 381240 | 26200 |

**Senescent Vegetation**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170119T074231_N0204_R092_T36JUT | 36S | 387540 | 7237130 |
S2A_MSIL1C_20170119T074231_N0204_R092_T36JUT | 36S | 381920 | 7259800 |
S2A_MSIL1C_20170119T074231_N0204_R092_T36JUT | 36S | 375110 | 7261040 |
S2A_MSIL1C_20170119T074231_N0204_R092_T36JUT | 36S | 379990 | 7209420 |
S2A_MSIL1C_20170516T154911_N0205_R054_T18TWQ | 18N | 563770 | 4938390 |

**Tundra & Wetlands**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170718T210021_N0205_R100_T08WNB | 8N | 508380 | 7654750 |
S2A_MSIL1C_20170718T210021_N0205_R100_T08WNB | 8N | 540940 | 7608620 |
S2A_OPER_PRD_MSIL1C_PDMC_20160318T145513 | 19S | 495986 | 7997974 |
S2B_MSIL1C_20170916T215519_N0205_R029_T06WVB | 6N | 442210 | 7700040 |
S2B_MSIL1C_20170916T215519_N0205_R029_T06WVB | 6N | 458950 | 7676830 |

**Mangroves**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170427T153621_N0205_R068_T18NTP | 18N | 258620 | 824760 |
S2A_MSIL1C_20170704T013711_N0205_R031_T52MHD | 52S | 814620 | 9839210 |
S2A_MSIL1C_20170705T022551_N0205_R046_T50NMN | 50N | 498390 | 752360 |
S2A_MSIL1C_20170705T022551_N0205_R046_T50NMN | 50N | 423780 | 704730 |
S2A_MSIL1C_20170916T055631_N0205_R091_T42RUN | 42N | 319520 | 2736030 |
S2A_OPER_PRD_MSIL1C_PDMC_20161018T073751 | 38N | 655730 | 3419140 |
S2B_MSIL1C_20170826T155519_N0205_R011_T17NMJ | 17N | 472220 | 875270 |
S2B_MSIL1C_20170919T140039_N0205_R067_T21KVA | 21S | 445610 | 8017250 |
S2B_MSIL1C_20171123T043059_N0206_R133_T45QYE | 45N | 756960 | 2481220 |
S2B_MSIL1C_20171123T043059_N0206_R133_T45QYE | 45N | 763390 | 2429410 |

**Rock & Alluvium—1**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20160723T143750_T19KER | 19S | 506000 | 7534310 |
S2A_MSIL1C_20170124T051101_N0204_R019_T44RQV | 44N | 781870 | 3417600 |
S2A_MSIL1C_20170412T074611_N0204_R135_T37PDQ | 37N | 467190 | 1496550 |
S2A_MSIL1C_20170412T074611_N0204_R135_T37PDQ | 37N | 415880 | 1480390 |
S2A_MSIL1C_20170613T182921_N0205_R027_T11SMB | 11N | 478340 | 4162580 |
S2A_MSIL1C_20170613T182921_N0205_R027_T11SMB | 11N | 441920 | 4110190 |
S2A_MSIL1C_20170613T182921_N0205_R027_T11SMB | 11N | 424630 | 4194020 |
S2A_MSIL1C_20170613T182921_N0205_R027_T11SMB | 11N | 429810 | 4180830 |
S2A_MSIL1C_20170627T180911_N0205_R084_T12SUF | 12N | 310360 | 4011400 |
S2A_MSIL1C_20170627T180911_N0205_R084_T12SUF | 12N | 304930 | 4096250 |

**Rock & Alluvium—2**

TileID | UTM Zone | Easting | Northing |
---|---|---|---|
S2A_MSIL1C_20170627T180911_N0205_R084_T12SUG | 12N | 393280 | 4169500 |
S2A_MSIL1C_20170908T063621_N0205_R120_T40QFK | 40N | 664760 | 2494790 |
S2A_MSIL1C_20171201T150711_N0206_R039_T18LZH | 18S | 866060 | 8213050 |
S2A_MSIL1C_20171207T082321_N0206_R121_T34HCH | 34S | 395100 | 6286480 |
S2A_OPER_PRD_MSIL1C_PDMC_20151022T184002 | 11N | 516790 | 4027140 |
S2A_OPER_PRD_MSIL1C_PDMC_20160318T145513 | 19S | 486817 | 8008443 |
S2B_MSIL1C_20171103T061009_N0206_R134_T42SWC | 42N | 576560 | 3774420 |
S2B_MSIL1C_20171103T061009_N0206_R134_T42SWD | 42N | 544220 | 3856340 |
S2B_MSIL1C_20171202T064229_N0206_R120_T40RGU | 40N | 768340 | 3304040 |
S2B_MSIL1C_20171212T064249_N0206_R120_T40QEL | 40N | 520620 | 2570980 |

## References

1. Landgrebe, D.; Hoffer, R.; Goodrick, F. An Early Analysis of ERTS-1 Data. 1972. Available online: http://docs.lib.purdue.edu/larstech/106 (accessed on 1 October 2022).
2. Straub, C.L.; Koontz, S.R.; Loomis, J.B. Economic Valuation of Landsat Imagery. In Open-File Report; U.S. Geological Survey: Reston, VA, USA, 2019; p. 13.
3. Zhu, Z.; Wulder, M.A.; Roy, D.P.; Woodcock, C.E.; Hansen, M.C.; Radeloff, V.C.; Healey, S.P.; Schaaf, C.; Hostert, P.; Strobl, P. Benefits of the Free and Open Landsat Data Policy. Remote Sens. Environ. **2019**, 224, 382–385.
4. Landgrebe, D. Machine Processing for Remotely Acquired Data. In LARS Technical Reports; Purdue University Press: West Lafayette, IN, USA, 1973; p. 29. Available online: http://docs.lib.purdue.edu/larstech/29 (accessed on 1 October 2022).
5. Price, J.C. Spectral Band Selection for Visible-near Infrared Remote Sensing: Spectral-Spatial Resolution Tradeoffs. IEEE Trans. Geosci. Remote Sens. **1997**, 35, 1277–1285.
6. Wulder, M.A.; Roy, D.P.; Radeloff, V.C.; Loveland, T.R.; Anderson, M.C.; Johnson, D.M.; Healey, S.; Zhu, Z.; Scambos, T.A.; Pahlevan, N. Fifty Years of Landsat Science and Impacts. Remote Sens. Environ. **2022**, 280, 113195.
7. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. **2012**, 120, 25–36.
8. Camps-Valls, G. Machine Learning in Remote Sensing Data Processing; IEEE: New York, NY, USA, 2009; pp. 1–6.
9. Lary, D.J.; Alavi, A.H.; Gandomi, A.H.; Walker, A.L. Machine Learning in Geosciences and Remote Sensing. Geosci. Front. **2016**, 7, 3–10.
10. Maxwell, A.E.; Warner, T.A.; Fang, F. Implementation of Machine-Learning Classification in Remote Sensing: An Applied Review. Int. J. Remote Sens. **2018**, 39, 2784–2817.
11. Thompson, D.; Brodrick, P. Making Machine Learning Work for Geoscience: Imaging Spectroscopy as a Case Example. EOS **2021**.
12. Roscher, R.; Bohn, B.; Duarte, M.; Garcke, J. Explain It to Me—Facing Remote Sensing Challenges in the Bio-and Geosciences With Explainable Machine Learning. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. **2020**, 3, 817–824.
13. Small, C. Grand Challenges in Remote Sensing Image Analysis and Classification. Front. Remote Sens. **2021**, 2, 619818.
14. Cayton, L. Algorithms for Manifold Learning. Univ. Calif. San Diego Tech. Rep. **2005**, 12, 1–17.
15. Izenman, A.J. Introduction to Manifold Learning. WIREs Comput. Stat. **2012**, 4, 439–446.
16. Van Der Maaten, L.; Postma, E.; Van den Herik, J. Dimensionality Reduction: A Comparative Review. J. Mach. Learn Res. **2009**, 10, 13.
17. Pearson, K. LIII. On Lines and Planes of Closest Fit to Systems of Points in Space. Lond. Edinb. Dublin Philos. Mag. J. Sci. **1901**, 2, 559–572.
18. Small, C. Spatiotemporal Dimensionality and Time-Space Characterization of Multitemporal Imagery. Remote Sens. Environ. **2012**, 124, 793–809.
19. Woodcock, C.E.; Strahler, A.H. The Factor of Scale in Remote Sensing. Remote Sens. Environ. **1987**, 21, 311–332.
20. Adams, J.B.; Smith, M.O.; Johnson, P.E. Spectral Mixture Modeling: A New Analysis of Rock and Soil Types at the Viking Lander 1 Site. J. Geophys. Res. Solid Earth **1986**, 91, 8098–8112.
21. Gillespie, A. Interpretation of Residual Images: Spectral Mixture Analysis of AVIRIS Images, Owens Valley, California. In Second Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Workshop; NASA: Pasadena, CA, USA, 1990; pp. 243–270.
22. Smith, M.O.; Ustin, S.L.; Adams, J.B.; Gillespie, A.R. Vegetation in Deserts: I. A Regional Measure of Abundance from Multispectral Images. Remote Sens. Environ. **1990**, 31, 1–26.
23. Niv, I.; Bregman, Y.; Rabin, N. Identification of Mine Explosions Using Manifold Learning Techniques. IEEE Trans. Geosci. Remote Sens. **2022**, 60, 1–13.
24. Li, H.; Cui, J.; Zhang, X.; Han, Y.; Cao, L. Dimensionality Reduction and Classification of Hyperspectral Remote Sensing Image Feature Extraction. Remote Sens. **2022**, 14, 4579.
25. Sobien, D.; Higgins, E.; Krometis, J.; Kauffman, J.; Freeman, L. Improving Deep Learning for Maritime Remote Sensing through Data Augmentation and Latent Space. Mach. Learn. Knowl. Extr. **2022**, 4, 31.
26. Liu, Y.; Chen, J.; Tan, C.; Zhan, J.; Song, S.; Xu, W.; Yan, J.; Zhang, Y.; Zhao, M.; Wang, Q. Intelligent Scanning for Optimal Rock Discontinuity Sets Considering Multiple Parameters Based on Manifold Learning Combined with UAV Photogrammetry. Eng. Geol. **2022**, 309, 106851.
27. Sousa, F.J.; Sousa, D.J. Hyperspectral Reconnaissance: Joint Characterization of the Spectral Mixture Residual Delineates Geologic Unit Boundaries in the White Mountains, CA. Remote Sens. **2022**, 14, 4914.
28. Sousa, D.; Small, C. Joint Characterization of Multiscale Information in High Dimensional Data. Adv. Artif. Intell. Mach. Learn. **2021**, 1, 196–212.
29. Small, C.; Sousa, D. Joint Characterization of the Cryospheric Spectral Feature Space. Front. Remote Sens. **2021**, 2.
30. Sousa, D.; Small, C. Joint Characterization of Spatiotemporal Data Manifolds. Front. Remote Sens. **2022**, 3, 760650.
31. Small, C.; Sousa, D. The Climatic Temporal Feature Space: Continuous and Discrete. Adv. Artif. Intell. Mach. Learn. **2021**, 1, 165–183.
32. Small, C.; Sousa, D. The Sentinel 2 MSI Spectral Mixing Space. Remote Sens. **2022**.
33. McInnes, L.; Healy, J.; Melville, J. Umap: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv **2018**, arXiv:1802.03426.
34. Mitchell, T.D.; Jones, P.D. An Improved Method of Constructing a Database of Monthly Climate Observations and Associated High-resolution Grids. Int. J. Climatol. J. R. Meteorol. Soc. **2005**, 25, 693–712.
35. Houghton, E. Climate Change 1995: The Science of Climate Change: Contribution of Working Group I to the Second Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, 1996; Volume 2, ISBN 0-521-56436-0.
36. Small, C. The Landsat ETM+ Spectral Mixing Space. Remote Sens. Environ. **2004**, 93, 1–17.
37. Small, C.; Milesi, C. Multi-Scale Standardized Spectral Mixture Models. Remote Sens. Environ. **2013**, 136, 442–454.
38. Sousa, D.; Small, C. Global Cross-Calibration of Landsat Spectral Mixture Models. Remote Sens. Environ. **2017**, 192, 139–149.
39. Sousa, D.; Small, C. Globally Standardized MODIS Spectral Mixture Models. Remote Sens. Lett. **2019**, 10, 1018–1027.
40. Sousa, D.; Small, C. Multisensor Analysis of Spectral Dimensionality and Soil Diversity in the Great Central Valley of California. Sensors **2018**, 18, 583.
41. Sousa, D.; Brodrick, P.G.; Cawse-Nicholson, K.; Fisher, J.B.; Pavlick, R.; Small, C.; Thompson, D.R. The Spectral Mixture Residual: A Source of Low-Variance Information to Enhance the Explainability and Accuracy of Surface Biology and Geology Retrievals. J. Geophys. Res. Biogeosci. **2022**, 127, e2021JG006672.
42. Settle, J.J.; Drake, N.A. Linear Mixing and the Estimation of Ground Cover Proportions. Int. J. Remote Sens. **1993**, 14, 1159–1177.
43. Kauth, R.J.; Thomas, G.S. The Tasselled Cap—A Graphic Description of the Spectral-Temporal Development of Agricultural Crops as Seen by LANDSAT. The Laboratory for Applications of Remote Sensing. In Proceedings of the Symposium on Machine Processing of Remotely Sensed Data, West Lafayette, IN, USA, 29 July 1976; Purdue University: West Lafayette, IN, USA, 1976; Volume 159, pp. 41–51.
44. Crist, E.P.; Cicone, R.C. A Physically-Based Transformation of Thematic Mapper Data—The TM Tasseled Cap. IEEE Trans. Geosci. Remote Sens. **1984**, GE-22, 256–263.
45. McInnes, L. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction—Umap 0.5 Documentation. Available online: https://umap-learn.readthedocs.io/en/latest/ (accessed on 13 October 2022).
46. Boardman, J.W. Automating Spectral Unmixing of AVIRIS Data Using Convex Geometry Concepts; US Gov. Public Use Permitted: Washington, DC, USA, 1993; Volume 1, pp. 11–14.
47. Boardman, J.W. Leveraging the High Dimensionality of AVIRIS Data for Improved Sub-Pixel Target Unmixing and Rejection of False Positives: Mixture Tuned Matched Filtering; NASA Jet Propulsion Laboratory: Pasadena, CA, USA, 1998; Volume 97, pp. 55–56.
48. Parker, R. Geophysical Inverse Theory; Princeton University Press: Princeton, NJ, USA, 1994; ISBN 978-0-691-03634-2.
49. Tarantola, A. Inverse Problem Theory and Methods for Model Parameter Estimation; SIAM: Philadelphia, PA, USA, 2005; ISBN 0-89871-572-5.
50. Menke, W. Geophysical Data Analysis: Discrete Inverse Theory, 4th ed.; Academic Press: Cambridge, MA, USA, 2018; ISBN 978-0-12-813556-3.
51. Bachmann, C.M.; Ainsworth, T.L.; Fusina, R.A. Exploiting Manifold Geometry in Hyperspectral Imagery. IEEE Trans. Geosci. Remote Sens. **2005**, 43, 441–454.
52. Gillis, D.; Bowles, J.; Lamela, G.M.; Rhea, W.J.; Bachmann, C.M.; Montes, M.; Ainsworth, T. Manifold Learning Techniques for the Analysis of Hyperspectral Ocean Data; International Society for Optics and Photonics: Washington, DC, USA, 2005; Volume 5806, pp. 342–351.
53. Van der Maaten, L.; Hinton, G. Visualizing Data Using T-SNE. J. Mach. Learn. Res. **2008**, 9, 2579–2605.
54. Belkin, M.; Niyogi, P. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Comput. **2003**, 15, 1373–1396.
55. Kobak, D.; Linderman, G.C. Initialization Is Critical for Preserving Global Data Structure in Both T-SNE and UMAP. Nat. Biotechnol. **2021**, 39, 156–157.
56. Xiang, R.; Wang, W.; Yang, L.; Wang, S.; Xu, C.; Chen, X. A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-Seq Data. Front. Genet. **2021**, 12, 646936.
57. Hozumi, Y.; Wang, R.; Yin, C.; Wei, G.-W. UMAP-Assisted K-Means Clustering of Large-Scale SARS-CoV-2 Mutation Datasets. Comput. Biol. Med. **2021**, 131, 104264.
58. Jiale, Y.; Ying, Z. Visualization Method of Sound Effect Retrieval Based on UMAP. In Proceedings of the 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chongqing, China, 12–14 June 2020; Volume 1, pp. 2216–2220.
59. Small, C. Multiresolution Analysis of Urban Reflectance; IEEE: New York, NY, USA, 2001; pp. 15–19.
60. Green, R.O.; Mahowald, N.; Ung, C.; Thompson, D.R.; Bator, L.; Bennet, M.; Bernas, M.; Blackway, N.; Bradley, C.; Cha, J.; et al. The Earth Surface Mineral Dust Source Investigation: An Earth Science Imaging Spectroscopy Mission. In Proceedings of the 2020 IEEE Aerospace Conference, Big Sky, MT, USA, 7–14 March 2020; pp. 1–15.
61. Krutz, D.; Müller, R.; Knodt, U.; Günther, B.; Walter, I.; Sebastian, I.; Säuberlich, T.; Reulke, R.; Carmona, E.; Eckardt, A.; et al. The Instrument Design of the DLR Earth Sensing Imaging Spectrometer (DESIS). Sensors **2019**, 19, 1622.
62. Candela, L.; Formaro, R.; Guarini, R.; Loizzo, R.; Longo, F.; Varacalli, G. The PRISMA Mission. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 253–256.
63. Nieke, J.; Rast, M. Towards the Copernicus Hyperspectral Imaging Mission for the Environment (CHIME); IEEE: New York, NY, USA, 2018; pp. 157–159.
64. Iwasaki, A.; Ohgi, N.; Tanii, J.; Kawashima, T.; Inada, H. Hyperspectral Imager Suite (HISUI)—Japanese Hyper-Multi Spectral Radiometer; IEEE: New York, NY, USA, 2011; pp. 1025–1028.
65. Thompson, D.R.; Schimel, D.S.; Poulter, B.; Brosnan, I.; Hook, S.J.; Green, R.O.; Glenn, N.; Guild, L.; Henn, C.; Cawse-Nicholson, K. NASA’s Surface Biology and Geology Concept Study: Status and Next Steps; IEEE: New York, NY, USA, 2021; pp. 3269–3271.
66. Asner, G.P.; Knapp, D.E.; Boardman, J.; Green, R.O.; Kennedy-Bowdoin, T.; Eastwood, M.; Martin, R.E.; Anderson, C.; Field, C.B. Carnegie Airborne Observatory-2: Increasing Science Data Dimensionality via High-Fidelity Multi-Sensor Fusion. Remote Sens. Environ. **2012**, 124, 454–465.
67. Boardman, J.W.; Green, R.O. Exploring the Spectral Variability of the Earth as Measured by AVIRIS in 1999. In Proceedings of the Summaries of the 8th Annual JPL Airborne Geoscience Workshop, Pasadena, CA, USA, 1 December 2000; NASA: Pasadena, CA, USA, 2000; Volume 1, pp. 1–12.
68. Cawse-Nicholson, K.; Hook, S.J.; Miller, C.E.; Thompson, D.R. Intrinsic Dimensionality in Combined Visible to Thermal Infrared Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2019**, 12, 4977–4984.
69. Cawse-Nicholson, K.; Damelin, S.B.; Robin, A.; Sears, M. Determining the Intrinsic Dimension of a Hyperspectral Image Using Random Matrix Theory. IEEE Trans. Image Process. **2013**, 22, 1301–1310.
70. Thompson, D.R.; Boardman, J.W.; Eastwood, M.L.; Green, R.O. A Large Airborne Survey of Earth’s Visible-Infrared Spectral Dimensionality. Opt. Express **2017**, 25, 9186–9195.
71. Tenenbaum, J.B.; Silva, V.D.; Langford, J.C. A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science **2000**, 290, 2319.
72. Kruskal, J.B. Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis. Psychometrika **1964**, 29, 1–27.
73. Kruskal, J.B. Nonmetric Multidimensional Scaling: A Numerical Method. Psychometrika **1964**, 29, 115–129.

**Figure 1.** Geographic and climatic distributions of 110 Sentinel-2 tiles from spectral diversity hotspots. Geographic distribution of sample sites is guided by climatic and geologic diversity as well as overall species biodiversity (**a**; top). Individual tile selection criteria favor spectral diversity arising from land cover diversity within and across biomes. Tile geographic coverage corresponds well to global land area distribution within the climatic parameter space (**b**; lower left) based on 1 degree monthly mean temperature and precipitation (1900–2002) from [34]. All biomes are well represented. Biome classification (**c**; lower right) adapted from [35].

**Figure 2.** Sentinel-2 composites for land cover subcategories (10 × 10 km) selected from individual hotspot tiles. A 1% linear stretch is applied.

**Figure 3.** Sentinel-2 SVD spectral mixing space, spectral endmembers, and corresponding SVD fraction space. An eight-column (80,000,000 spectra; Columns 2–9 of Figure 2, to emphasize mixing space edges) subset of the Land Cover Subcategory mosaic encompassing the SVD-bounded plane of the full mixing space (**a**) is effectively 2D with Principal Component (PC) dimensions 1 (81%) and 2 (14%) accounting for 95% of total variance, compared to PC 3 (2%). Maximum amplitude (Outer) and lower amplitude mean (Inner) endmember spectra for Substrate and Vegetation define bases for maximal and minimal SVD models (**a**; lower right). Inversion of the minimal model provides liberal estimates of SVD fractions (**b**) but excludes pure sand landscapes. Because sands lie outside the minimal SVD model, their Substrate fractions exceed 1.0 with Dark fractions < 0. A planar SVD fraction distribution can be projected onto a 2D ternary diagram (**b**; lower right) with no loss of information.

**Figure 4.** Joint characterization of agriculture. 10 × 1 megapixel Sentinel-2 tile subsets are selected from global agricultural hotspots and analyzed at full 10 m pixel resolution. These spectra fill out nearly the entire global SVD mixing space (**a**) and are well represented by a single global 3-endmember linear mixture model (99% of spectra with <5% RMSE). Manifold learning (**b**, using UMAP) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**e**) uses the physical meaning of the mixture fractions to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Statistically distinct clusters identified through joint characterization frequently show geographic coherence (**c**).

**Figure 5.** Joint characterization of sands. 10 × 1 megapixel Sentinel-2 tile subsets are selected from global sand hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the S apex of the SVD mixing space, with mixing toward D (**a**), leaving the V portion of the space very sparse. The global 3-endmember linear mixture model fits these spectra better than the agricultural spectra (here, >99.9% of spectra with <5% RMSE), but fractions regularly exceed 100%. UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Substrate mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).
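The joint space referred to in these captions pairs one mixture fraction with one UMAP dimension for every pixel. A minimal sketch of such a bivariate distribution follows. It assumes the UMAP coordinates were computed beforehand (for example with a UMAP library) and are supplied as a plain list. The function name `joint_histogram`, the bin count, and the clamping of out-of-range fractions into the edge bins are illustrative assumptions rather than the procedure used in this study.

```python
def joint_histogram(fractions, umap_values, n_bins=10, f_range=(0.0, 1.0)):
    """Count pixels in each (fraction bin, UMAP bin) cell of the joint space.

    fractions   : one mixture fraction per pixel (e.g., Substrate)
    umap_values : one precomputed UMAP coordinate per pixel (assumed input)
    """
    f_min, f_max = f_range
    u_min, u_max = min(umap_values), max(umap_values)

    def bin_index(value, lo, hi):
        if hi == lo:
            return 0
        i = int((value - lo) / (hi - lo) * n_bins)
        # Clamp so fractions below 0 or above 1 land in the edge bins.
        return max(0, min(n_bins - 1, i))

    counts = [[0] * n_bins for _ in range(n_bins)]
    for f, u in zip(fractions, umap_values):
        counts[bin_index(f, f_min, f_max)][bin_index(u, u_min, u_max)] += 1
    return counts

# Hypothetical values: two low-fraction pixels and two high-fraction pixels,
# one of which exceeds 100% as can occur for high-albedo sands.
counts = joint_histogram([0.05, 0.05, 0.95, 1.2], [0.0, 0.1, 9.9, 10.0], n_bins=2)
```

Cells with high counts that are separated along the UMAP axis but share similar fraction values correspond to the statistically distinct clusters from which regions of interest can be drawn.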

**Figure 6.** Joint characterization of lava and ash. 5 × 1 megapixel Sentinel-2 tile subsets are selected from global volcanic hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the S to D apexes of the SVD mixing space (**a**), leaving the V portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra less well than the agricultural spectra (here, 96.5% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Substrate mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 7.** Joint characterization of urban landscapes. 5 × 1 megapixel Sentinel-2 tile subsets are selected from global urban hotspots and analyzed at full 10 m pixel resolution. These spectra fill out most of the SVD mixing space (**a**). The global 3-endmember linear mixture model fits these spectra less well than the agricultural spectra (here, 97.5% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Substrate mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 8.** Joint characterization of forests (1). 10 × 1 megapixel Sentinel-2 tile subsets are selected from global forest diversity hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the V to D apexes of the SVD mixing space (**a**), leaving the S portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra better than the agricultural spectra (here, >99.9% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Vegetation mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 9.** Joint characterization of forests (2). 10 × 1 megapixel Sentinel-2 tile subsets are selected from global forest diversity hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the V to D apexes of the SVD mixing space (**a**), leaving the S portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra better than the agricultural spectra (here, >99.9% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Vegetation mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 10.** Joint characterization of senescent vegetation. 5 × 1 megapixel Sentinel-2 tile subsets are selected from global forest diversity hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the V to D apexes of the SVD mixing space (**a**), leaving the S portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra better than the agricultural spectra (here, 99.9% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Vegetation mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 11.** Joint characterization of tundra. 5 × 1 megapixel Sentinel-2 tile subsets are selected from global tundra diversity hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the V to D apexes of the SVD mixing space (**a**), leaving the S portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra better than the agricultural spectra (here, 99.8% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Dark mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 12.** Joint characterization of mangroves and wetlands. 10 × 1 megapixel Sentinel-2 tile subsets are selected from global forest diversity hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the V to D apexes of the SVD mixing space (**a**), leaving the S portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra better than the agricultural spectra (here, >99.9% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Vegetation mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 13.** Joint characterization of rocks and alluvium (1). 10 × 1 megapixel Sentinel-2 tile subsets are selected from global geology hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the S to D apexes of the SVD mixing space (**a**), leaving the V portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra less well than the agricultural spectra (here, only 96% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Substrate mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 14.** Joint characterization of rocks and alluvium (2). 10 × 1 megapixel Sentinel-2 tile subsets are selected from global geology hotspots and analyzed at full 10 m pixel resolution. These spectra preferentially occupy the S to D apexes of the SVD mixing space (**a**), leaving the V portion of the space relatively sparse. The global 3-endmember linear mixture model fits these spectra less well than the agricultural spectra (here, 98.5% of spectra with <5% RMSE). UMAP (**b**) captures both subtle mixing continua and discrete clusters, but does not offer physical interpretability. Joint characterization (**c**) uses the physical meaning of the Substrate mixture fraction to contextualize the subtle statistical relationships captured by UMAP. Example regions of interest are identified from the joint space and projected onto the ternary mixing and UMAP spaces. Mean spectra for each region (**d**) illustrate similarities and differences among statistically distinct clusters. Clusters identified by joint characterization also frequently show geographic coherence (**e**).

**Figure 15.** SVD fractions summarized by land cover type. Sands are dominated by S. Other geologic scenes show more mixing towards D. Urban, senescent, and agriculture show increasing mixing towards V, respectively. Forests, mangroves, and tundra then show decreasing S and increased skew towards binary V ↔ D mixing, respectively.
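Because the fractions satisfy a unit sum constraint, each (S, V, D) triplet can be placed on a ternary diagram with no loss of information. The sketch below shows one way to compute such plotting coordinates. The vertex layout (D at the origin, S at the lower right, V at the top), the tolerance, and the function name `ternary_xy` are arbitrary illustrative choices and are not taken from the figures.

```python
import math

def ternary_xy(s, v, d, tol=1e-6):
    """Map unit-sum (S, V, D) fractions to 2D ternary plot coordinates.

    Assumed vertex layout (illustrative choice only):
    D at (0, 0), S at (1, 0), V at (0.5, sqrt(3)/2).
    """
    if abs(s + v + d - 1.0) > tol:
        raise ValueError("fractions must sum to 1 (unit sum constraint)")
    # D contributes nothing because its vertex sits at the origin.
    x = s + 0.5 * v
    y = (math.sqrt(3) / 2) * v
    return x, y

# A pure-Substrate pixel sits on the S vertex.
pure_s = ternary_xy(1.0, 0.0, 0.0)
# A hypothetical high-albedo sand with S > 100% and D < 0 still sums to 1
# but plots outside the triangle, beyond the S vertex.
bright_sand = ternary_xy(1.1, 0.0, -0.1)
```

Fractions outside the 0 to 100% range are left unclipped here so that out-of-range pixels remain visible beyond the triangle edges.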

**Figure 16.** UMAP summary. Urban and senescent show highly continuous manifolds. Forests, agriculture, tundra, and mangrove show increasing clustering/decreasing continuity, respectively. Of the geologic scenes, rocks and alluvium show more continuous manifolds, with lava/ash and sands showing highly sinuous, clustered manifolds with a large number of distinct apexes.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sousa, D.; Small, C.
Joint Characterization of Sentinel-2 Reflectance: Insights from Manifold Learning. *Remote Sens.* **2022**, *14*, 5688.
https://doi.org/10.3390/rs14225688
