Conference Report

Beyond mirkwood: Enhancing SED Modeling with Conformal Predictions

Sankalp Gilda

ML Collective, 22 Saturn St., San Francisco, CA 94114, USA
Astronomy 2024, 3(1), 14-20
Submission received: 28 December 2023 / Revised: 6 February 2024 / Accepted: 8 February 2024 / Published: 10 February 2024


Traditional spectral energy distribution (SED) fitting techniques face uncertainties due to assumptions in star formation histories and dust attenuation curves. We propose an advanced machine learning-based approach that enhances flexibility and uncertainty quantification in SED fitting. Unlike the fixed NGBoost model used in mirkwood, our approach allows for any scikit-learn-compatible model, including deterministic models. We incorporate conformalized quantile regression to convert point predictions into error bars, enhancing interpretability and reliability. Using CatBoost as the base predictor, we compare results with and without conformal prediction, demonstrating improved performance using metrics such as coverage and interval width. Our method offers a more versatile and accurate tool for deriving galaxy physical properties from observational data.

1. Introduction

Spectral energy distributions (SEDs) are pivotal in astrophysics for understanding the intrinsic properties of galaxies, such as stellar mass, age distributions, star formation rates, and dust content. Traditional SED fitting methods, while insightful, often face significant challenges. These challenges stem from the complex nature of galaxies, including diverse star formation histories and varying dust attenuation curves [1,2,3,4,5,6]. The inherent uncertainties in these aspects can significantly affect the accuracy of derived galaxy properties, thus impacting our broader understanding of galactic evolution and formation.
Recent advancements in computational methods have opened new avenues in this field. Machine learning (ML), with its ability to handle large datasets and uncover complex patterns, has emerged as a powerful tool in SED fitting [1,2,7,8]. The traditional parametric and often linear approaches are being supplemented, and in some cases replaced, by non-parametric, highly flexible ML techniques that can model the non-linear relationships intrinsic to astronomical data more effectively [9,10,11,12,13]. This paradigm shift is not just a matter of computational convenience but represents a fundamental change in how we interpret vast and complex astronomical datasets.
This paper introduces an approach that builds upon and significantly expands the capabilities of mirkwood [1,2,3], a machine learning-based application previously developed for SED fitting. Our method enhances the flexibility and depth of analysis by enabling the use of any scikit-learn-compatible model [14]. This includes not only probabilistic models but also deterministic ones, thereby broadening the scope of application to a wider range of astronomical problems. Moreover, we integrate the uncertainty quantification technique of conformalized quantile regression (CQR) [15], which allows us to translate point predictions into meaningful error bars. This addition is crucial in fields like astronomy, where quantifying the uncertainty of predictions is as important as the predictions themselves. The combination of these techniques positions our tool at the forefront of SED fitting technologies, offering a more nuanced and comprehensive understanding of galaxy properties.
In the context of SED fitting, the ability to quantify uncertainty is essential for several reasons. First, it enables astronomers to distinguish between variations in galactic properties that are due to inherent physical processes and those arising from observational limitations. Second, in fields such as cosmology, where the accurate determination of galaxy properties impacts our understanding of the universe’s evolution, refined uncertainty quantification offers a way to assess the reliability of these large-scale inferences. Thus, enhancing the precision of uncertainty quantification in SED fitting directly contributes to our fundamental understanding of the universe.

2. Background

The integration of machine learning (ML) into spectral energy distribution (SED) fitting marks a significant departure from conventional methodologies. Traditional SED fitting methods predominantly rely on parametric models to interpret observational data. These approaches, while foundational, are inherently constrained by the assumptions embedded within the parametric models and can be computationally demanding [16,17]. The emergence of ML offers a paradigm shift, enabling the exploration of complex, non-linear relationships within large astronomical datasets with enhanced efficiency and less reliance on predefined assumptions.
Among the plethora of ML algorithms, CatBoost stands out for its capability to efficiently manage categorical variables—a frequent characteristic of astronomical data. As a sophisticated form of gradient boosting, CatBoost incrementally constructs an ensemble of decision trees to improve predictive accuracy, particularly excelling in handling datasets that include a mixture of categorical and continuous features [18]. This attribute renders CatBoost particularly apt for SED fitting, where the dataset encompasses a diverse array of photometric bands and derived galaxy characteristics.
Adding to the methodological innovation, conformalized quantile regression (CQR) introduces a robust mechanism for uncertainty quantification. CQR extends beyond traditional point predictions by generating prediction intervals that encapsulate the expected range of outcomes with a given confidence level. This method aligns with the demands of astronomical research, where precise uncertainty estimation is paramount for the reliable interpretation of data regarding celestial bodies [19].
Our approach synergizes the adaptive model selection capability provided by the scikit-learn framework with the advanced uncertainty quantification offered by CQR. This combination heralds a significant advancement in SED fitting techniques, facilitating a deeper and more accurate delineation of galactic properties. By embracing this novel methodology, we not only enhance the fidelity of our predictions but also gain critical insights into the confidence levels associated with these predictions, addressing a fundamental challenge in astronomy—interpreting data that is often incomplete or contaminated by noise.

3. Data

Our training and testing datasets are derived from three advanced cosmological galaxy formation simulations, known for their accurate representation of galaxy physical properties, including authentic star formation histories. These simulations (Simba [20], Eagle [21,22,23], and IllustrisTNG [24]) provide a comprehensive and realistic variety of galaxy evolution scenarios, with sample sizes of 1688, 4697, and 9633, respectively. We focus on galaxies at redshift 0, representing them in their current state in the simulations. The spectral energy distributions (SEDs) in our datasets consist of 35 flux density measurements (in Jansky units) across different wavelengths, representing the luminosity of galaxies. These SEDs serve as the input features for our model. The target outputs, or labels, are four scalar galaxy properties: galaxy mass, metallicity, dust mass, and star formation rate. See Table 1 in Gilda et al. [1,2,3] and Narayanan et al. [4] for an overview of the distribution of these properties for all three simulations.

4. Methodology

4.1. Data Preprocessing

The foundation of any robust machine learning model is high-quality data. In our approach, we begin with a thorough data preprocessing phase. This involves cleaning the data, handling missing values, normalizing photometric fluxes, and encoding categorical variables where necessary. The preprocessing steps are critical in ensuring that the input data fed into the ML models are consistent, standardized, and reflective of the underlying physical phenomena.
We manually add Gaussian noise to the SEDs from the three simulations to obtain three separate sets of data at signal-to-noise ratios (SNRs) of 20, 10, and 5. We do this for a 1:1 comparison with the methodology and results of Gilda et al. [1,2,3] and Narayanan et al. [4].
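The noise injection described above can be sketched as follows. This is a minimal illustration, assuming zero-mean Gaussian noise whose standard deviation in each band equals the flux divided by the SNR (so that SNR = 20 corresponds to a 5% uncertainty). The function name `add_gaussian_noise` and the toy flux array are hypothetical and are not taken from the mirkwood code.

```python
import numpy as np

def add_gaussian_noise(fluxes, snr, seed=0):
    """Return (noisy_fluxes, sigma), assuming sigma = |flux| / snr in each band."""
    rng = np.random.default_rng(seed)
    fluxes = np.asarray(fluxes, dtype=float)
    sigma = np.abs(fluxes) / snr
    noisy = fluxes + rng.normal(0.0, 1.0, size=fluxes.shape) * sigma
    return noisy, sigma

# Toy stand-in for simulated SEDs: 1000 galaxies x 35 bands, fluxes in Jy.
clean = np.random.default_rng(1).uniform(1e-3, 1.0, size=(1000, 35))

# One noisy copy of the data per SNR level used in the text.
noisy_sets = {snr: add_gaussian_noise(clean, snr, seed=snr)[0] for snr in (20, 10, 5)}
```

Each entry of `noisy_sets` has the same shape as the clean array, so the three noise levels can be passed through the same downstream pipeline.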

4.2. Model Selection and Flexibility

Our methodology is characterized by its flexibility and adaptability in model selection. While mirkwood was initially designed around NGBoost, we expand its capabilities by enabling the use of any scikit-learn-compatible model. mirkwood is capable of taking in a galaxy SED in tabular form—with each row corresponding to a different SED and each column the flux in a different filter—and in a chained fashion, extracting that galaxy’s mass, dust mass, and metallicity. In this work, we provide support for deterministic models such as Support Vector Machines [25] and Random Forests [26], alongside probabilistic models like Gaussian Processes [27]. In fact, our code is flexible enough to allow a pipeline consisting of an arbitrary number of sklearn-compatible models. This flexibility allows astronomers to tailor the predictive models to their specific research needs and the characteristics of their datasets.

4.3. CatBoost as the Base Predictor

CatBoost, our chosen base predictor, is particularly well-suited for dealing with the types of datasets common in astronomical research. It efficiently handles categorical features and large datasets, reducing overfitting and improving predictive accuracy. In our implementation, we fine-tune CatBoost’s parameters, such as the depth of trees and learning rate, to optimize its performance for SED fitting tasks.

4.4. Incorporating Conformalized Quantile Regression

A significant enhancement in our methodology is the incorporation of conformalized quantile regression. This technique allows us to convert the point predictions from our models into prediction intervals. These intervals provide a statistical measure of the uncertainty in our predictions, giving us a range within which the true value of the predicted property is likely to fall, at a given confidence level. Implementing this technique involves calibrating our models to estimate the quantiles of the predictive distribution, a crucial step in providing reliable and interpretable error estimates. Since Gilda et al. [1] predict 1σ error bars, for an apples-to-apples comparison we set the significance level α to 0.318, so that the target coverage 1 − α ≈ 0.68 matches the probability mass within 1σ of a Gaussian. We train CatBoost with its quantile loss function and wrap the trained model in the MapieQuantileRegressor class from MAPIE (footnote 1).
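To make the calibration step concrete, the sketch below implements split conformalized quantile regression by hand, following the procedure of Romano et al. [15]. It is not the pipeline used in this work: scikit-learn's `GradientBoostingRegressor` with a quantile loss stands in for CatBoost, the calibration is written out explicitly rather than delegated to MAPIE, and the data are synthetic. Only the value α = 0.318 is taken from the text.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

alpha = 0.318  # miscoverage level; 1 - alpha is roughly the 1-sigma Gaussian coverage

# Synthetic heteroscedastic data (illustrative only, not galaxy SEDs).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(3000, 1))
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.1 + 0.1 * X[:, 0])

# Split into proper training, calibration, and test sets.
X_fit, X_rest, y_fit, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

def fit_quantile(q):
    """Fit a gradient-boosted regressor for the q-th conditional quantile."""
    model = GradientBoostingRegressor(loss="quantile", alpha=q,
                                      n_estimators=200, random_state=0)
    return model.fit(X_fit, y_fit)

# Step 1: lower and upper quantile regressors on the proper training set.
lo_model, hi_model = fit_quantile(alpha / 2), fit_quantile(1 - alpha / 2)

# Step 2: conformity scores on the calibration set (positive when y falls outside).
scores = np.maximum(lo_model.predict(X_cal) - y_cal, y_cal - hi_model.predict(X_cal))

# Step 3: the ceil((n + 1)(1 - alpha))-th smallest score (index clipped to n).
n = len(scores)
k = int(np.ceil((n + 1) * (1 - alpha)))
q_hat = np.sort(scores)[min(k, n) - 1]

# Step 4: widen (or shrink) the raw quantile band by q_hat on new data.
lower = lo_model.predict(X_test) - q_hat
upper = hi_model.predict(X_test) + q_hat
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"target coverage {1 - alpha:.3f}, empirical coverage {coverage:.3f}")
```

The correction `q_hat` is a single scalar, so the conformalized interval keeps the input-dependent shape of the raw quantile band and only shifts its edges until the calibration set is covered at the requested rate.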

4.5. Training and Validation

The final phase of our methodology involves training the machine learning models on a carefully curated dataset and validating their performance. For an apples-to-apples comparison with [1], the training set contains all 10,073 samples from IllustrisTNG, 4697 samples from Eagle, and 359 samples from Simba selected via stratified 5-fold CV (see Section 3 in Gilda et al. [1] for details). After running inference on all test splits, we collate the results, thereby predicting all four galaxy properties for all 1797 samples from Simba. Each predicted output for a physical property contains two values: the mean and the standard deviation.
In the fitting process, we first train the model using galaxy flux values to predict stellar mass. Then, we use the predicted stellar masses, combined with the original flux values, to predict dust mass, and continue this sequential prediction process for other parameters. See Figure 3 and Section 3 in Gilda et al. [1] for details.
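The sequential scheme described above can be sketched with any scikit-learn-compatible regressor. The example below is illustrative only: the helper names `fit_chain` and `predict_chain` are hypothetical, a random forest stands in for the base predictor, the data are synthetic, and the use of out-of-fold predictions as the extra training feature is an assumption made here to limit leakage, not a detail confirmed by the text. The order of targets after dust mass is likewise assumed.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def fit_chain(base_model, X, Y, cv=5):
    """Fit one model per target; model k sees X plus predictions for targets 0..k-1."""
    models = []
    features = np.asarray(X, dtype=float)
    for k in range(Y.shape[1]):
        model = clone(base_model)
        # Assumption: out-of-fold predictions become the extra training feature.
        oof = cross_val_predict(model, features, Y[:, k], cv=cv)
        models.append(model.fit(features, Y[:, k]))
        features = np.column_stack([features, oof])
    return models

def predict_chain(models, X):
    """Predict targets in order, feeding each prediction to the next model."""
    features = np.asarray(X, dtype=float)
    preds = []
    for model in models:
        p = model.predict(features)
        preds.append(p)
        features = np.column_stack([features, p])
    return np.column_stack(preds)

# Toy data: 35 "fluxes" per galaxy and 4 targets in an assumed chain order
# (stellar mass, dust mass, metallicity, star formation rate).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 35))
Y = np.column_stack([X[:, 5 * k:5 * (k + 1)].sum(axis=1) for k in range(4)])
Y += rng.normal(scale=0.1, size=Y.shape)

base = RandomForestRegressor(n_estimators=50, random_state=0)
models = fit_chain(base, X[:300], Y[:300])
Y_pred = predict_chain(models, X[300:])  # shape (100, 4)
```

Because `clone` works on any estimator that follows the scikit-learn interface, swapping `base` for a different regressor requires no change to the two helper functions.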
Through this comprehensive methodology, we aim to provide a powerful, flexible, and accurate tool for SED fitting, capable of handling the complexities and uncertainties inherent in astronomical datasets.

5. Comparative Analysis and Results

5.1. Comparative Analysis Methodology

To demonstrate the efficacy of our approach, we conducted a comprehensive comparative analysis. This involved comparing the performance of our enhanced tool against a traditional SED fitting tool, prospector [28], and the original mirkwood implementation. We focused on the same five performance metrics as in Gilda et al. [1] to evaluate the accuracy of derived galaxy properties (galaxy mass, dust mass, star formation rate, and metallicity) and the robustness of the model against variations in input data.

5.2. Performance Metrics

We use both deterministic and probabilistic metrics for comparison, the same five used in Gilda et al. [1]: normalized root mean squared error (NRMSE), normalized mean absolute error (NMAE), normalized bias error (NBE), average coverage error (ACE), and interval sharpness (IS). These are defined and described in detail in Section 3.2 of Gilda et al. [1]. In particular, coverage is the proportion of true values that fall within the predicted error bars, offering a measure of the reliability of our uncertainty quantification. Interval width (IW), on the other hand, is the average width of the prediction intervals, which provides insight into the precision of our predictions.
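As a small worked example of the two quantities described in words above, the sketch below computes empirical coverage and mean interval width from arrays of true values and interval bounds. The function names and the toy numbers are hypothetical; the exact normalizations behind ACE and IS follow Section 3.2 of Gilda et al. [1] and are not reproduced here.

```python
import numpy as np

def coverage(y_true, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

def mean_interval_width(lower, upper):
    """Average width of the prediction intervals."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))

# Toy example with four predictions; the second interval misses its true value.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
lower = np.array([0.5, 2.1, 2.0, 3.5])
upper = np.array([1.5, 2.9, 3.5, 4.5])

cov = coverage(y_true, lower, upper)    # 3 of 4 covered
iw = mean_interval_width(lower, upper)  # mean of widths 1.0, 0.8, 1.5, 1.0
```

The two numbers pull in opposite directions: widening every interval raises coverage trivially, so a method is only preferable when it reaches the target coverage with intervals that stay narrow.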

5.3. Results

To evaluate our proposed model for SED fitting, we conduct comparisons with fits obtained in Gilda et al. [1] from the Bayesian SED fitting software prospector, and their new machine learning tool mirkwood. We provide each of the three models (their two plus our upgraded version of mirkwood) with identical data to deduce galaxy properties. This data comprises broadband photometry across 35 bands, subject to Gaussian uncertainties of 5%, 10%, and 20% (corresponding to signal-to-noise ratios (SNRs) of 20, 10, and 5, respectively). In Tables 1, 2, and 3 we showcase the outcomes from all three methods for all four galaxy properties.
The results of our comparative analysis are encouraging. Our method consistently achieves higher coverage rates than both other methods, indicating more reliable uncertainty quantification. At the same time, the prediction intervals generated by our method are narrower on average, signifying more precise predictions.
These results underline the superiority of our approach in terms of both accuracy and reliability in SED fitting. By leveraging the power of CatBoost and the precision of conformalized quantile regression, our method not only enhances the accuracy of point predictions but also provides a more nuanced understanding of the associated uncertainties.

5.4. Discussion

The improvements observed in our analysis can be attributed to several factors. The flexibility in model selection allows for better adaptation to the specific characteristics of astronomical datasets. CatBoost's strength on tabular data lets it capture the complexities in the data, leading to more accurate predictions. The addition of conformalized quantile regression introduces a robust method for uncertainty quantification, a critical aspect often overlooked in traditional SED fitting.
Overall, the comparative analysis and the results obtained highlight the potential of our method in transforming the field of SED fitting, providing astronomers with a tool that is not only accurate but also comprehensive in its assessment of uncertainties.

6. Conclusions and Future Work

This study marks a substantial advancement in the field of spectral energy distribution (SED) fitting by integrating flexible machine learning models, particularly CatBoost, with the innovative technique of conformalized quantile regression. This approach not only enhances the accuracy of SED fitting but also introduces a new depth to the uncertainty quantification in astronomical research. The adaptability of our tool to various astronomical datasets, coupled with the ability to select from a range of scikit-learn-compatible models, ensures its applicability across different research contexts. CatBoost’s effectiveness in handling complex datasets, combined with our sophisticated method of uncertainty quantification, allows for more reliable and nuanced interpretations of galactic properties.
Our comparative analysis highlights the superiority of this method over traditional approaches, demonstrating improvements in both the accuracy of predictions and the understanding of associated uncertainties. This dual capability represents a significant stride in astronomy, offering a more reliable and comprehensive tool for exploring the universe.
Looking ahead, the potential for further advancements and extensions of our tool is vast. Future work may involve exploring the integration of additional machine learning models, such as deep learning architectures, to enhance predictive power and versatility. Testing and optimizing the tool on larger and more diverse datasets from upcoming astronomical surveys will be crucial for assessing its scalability and robustness. Further development in feature engineering and expanding the scope of uncertainty quantification could unlock new insights and details in SED fitting. Additionally, applying this tool to related fields like exoplanet studies or cosmic structure formation could demonstrate its adaptability and contribute to a broader range of scientific inquiries.


Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The author declares no conflicts of interest.


1 (accessed on 10 January 2024).


References

  1. Gilda, S.; Lower, S.; Narayanan, D. MIRKWOOD: Fast and accurate SED modeling using machine learning. Astrophys. J. 2021, 916, 43.
  2. Gilda, S.; Lower, S.; Narayanan, D. MIRKWOOD: SED Modeling Using Machine Learning; Astrophysics Source Code Library, Record ascl:2102.017. 2021. Available online: (accessed on 2 January 2024).
  3. Gilda, S.; Lower, S.; Narayanan, D. SED Analysis using Machine Learning Algorithms. Am. Astron. Soc. Meet. Abstr. 2021, 53, 119.03.
  4. Narayanan, D.; Gilda, S.; Lower, S. SED Fitting in the Modern Era: Fast and Accurate Machine-Learning Assisted Software. HST Proposal. Cycle 29, ID. #16626. 2021. Available online: (accessed on 2 January 2024).
  5. Acquaviva, V.; Raichoor, A.; Gawiser, E. Simultaneous estimation of photometric redshifts and sed parameters: Improved techniques and a realistic error budget. Astrophys. J. 2015, 804, 8.
  6. Simha, V.; Weinberg, D.H.; Conroy, C.; Dave, R.; Fardal, M.; Katz, N.; Oppenheimer, B.D. Parametrising Star Formation Histories. arXiv 2014, arXiv:1404.0402.
  7. Gilda, S.; de Mathelin, A.; Bellstedt, S.; Richard, G. Unsupervised Domain Adaptation for Constraining Star Formation Histories. arXiv 2021, arXiv:2112.14072.
  8. Chu, J.; Tang, H. Galaxy stellar and total mass estimation using machine learning. arXiv 2023, arXiv:2311.10351.
  9. Gilda, S. deep-REMAP: Parameterization of Stellar Spectra Using Regularized Multi-Task Learning. arXiv 2023, arXiv:2311.03738.
  10. Gilda, S.; Ge, J.; MARVELS. Parameterization of MARVELS Spectra Using Deep Learning. Am. Astron. Soc. Meet. Abstr. 2018, 231, 349.02.
  11. Gilda, S.; Draper, S.C.; Fabbro, S.; Mahoney, W.; Prunet, S.; Withington, K.; Wilson, M.; Ting, Y.S.; Sheinis, A. Uncertainty-aware learning for improvements in image quality of the Canada–France–Hawaii Telescope. Mon. Not. R. Astron. Soc. 2021, 510, 870–902.
  12. Gilda, S.; Ting, Y.S.; Withington, K.; Wilson, M.; Prunet, S.; Mahoney, W.; Fabbro, S.; Draper, S.C.; Sheinis, A. Astronomical Image Quality Prediction based on Environmental and Telescope Operating Conditions. arXiv 2020, arXiv:2011.03132.
  13. Gilda, S. Feature Selection for Better Spectral Characterization or: How I Learned to Start Worrying and Love Ensembles. Astron. Data Anal. Softw. Syst. XXVIII 2019, 523, 67.
  14. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  15. Romano, Y.; Patterson, E.; Candes, E. Conformalized quantile regression. Adv. Neural Inf. Process. Syst. 2019, 32.
  16. Walcher, J.; Groves, B.; Budavári, T.; Dale, D. Fitting the integrated spectral energy distributions of galaxies. Astrophys. Space Sci. 2011, 331, 1–51.
  17. Conroy, C. Modeling the panchromatic spectral energy distributions of galaxies. Annu. Rev. Astron. Astrophys. 2013, 51, 393–455.
  18. Dorogush, A.V.; Ershov, V.; Gulin, A. CatBoost: Gradient boosting with categorical features support. arXiv 2018, arXiv:1810.11363.
  19. Shafer, G.; Vovk, V. A Tutorial on Conformal Prediction. J. Mach. Learn. Res. 2008, 9, 371–421.
  20. Davé, R.; Anglés-Alcázar, D.; Narayanan, D.; Li, Q.; Rafieferantsoa, M.H.; Appleby, S. SIMBA: Cosmological simulations with black hole growth and feedback. Mon. Not. R. Astron. Soc. 2019, 486, 2827–2849.
  21. Schaye, J.; Crain, R.A.; Bower, R.G.; Furlong, M.; Schaller, M.; Theuns, T.; Dalla Vecchia, C.; Frenk, C.S.; McCarthy, I.G.; Helly, J.C.; et al. The EAGLE project: Simulating the evolution and assembly of galaxies and their environments. Mon. Not. R. Astron. Soc. 2015, 446, 521–554.
  22. Schaller, M.; Dalla Vecchia, C.; Schaye, J.; Bower, R.G.; Theuns, T.; Crain, R.A.; Furlong, M.; McCarthy, I.G. The EAGLE simulations of galaxy formation: The importance of the hydrodynamics scheme. Mon. Not. R. Astron. Soc. 2015, 454, 2277–2291.
  23. McAlpine, S.; Helly, J.C.; Schaller, M.; Trayford, J.W.; Qu, Y.; Furlong, M.; Bower, R.G.; Crain, R.A.; Schaye, J.; Theuns, T.; et al. The EAGLE simulations of galaxy formation: Public release of halo and galaxy catalogues. Astron. Comput. 2016, 15, 72–89.
  24. Vogelsberger, M.; Genel, S.; Springel, V.; Torrey, P.; Sijacki, D.; Xu, D.; Snyder, G.; Nelson, D.; Hernquist, L. Introducing the Illustris Project: Simulating the coevolution of dark and visible matter in the Universe. Mon. Not. R. Astron. Soc. 2014, 444, 1518–1547.
  25. Schölkopf, B.; Burges, C.J.; Smola, A.J. (Eds.) Advances in Kernel Methods: Support Vector Learning; MIT Press: Cambridge, MA, USA, 1999.
  26. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  27. Rasmussen, C.E.; Williams, C.K.I. Gaussian processes for machine learning. In Adaptive Computation and Machine Learning; MIT Press: Cambridge, MA, USA, 2006; pp. 1–248.
  28. Johnson, B.D.; Leja, J.L.; Conroy, C.; Speagle, J.S. Prospector: Stellar Population Inference from Spectra and SEDs; Astrophysics Source Code Library, Record ascl:1905.025. 2019. Available online: (accessed on 2 January 2024).
Table 1. Comparative performance of our proposed method vs. mirkwood vs. Prospector across different metrics, for data with SNR = 20. The five metrics are the normalized root mean squared error (NRMSE), normalized mean absolute error (NMAE), normalized bias error (NBE), average coverage error (ACE), and interval sharpness (IS). A bold value denotes the best metric for that galaxy property. A value of ‘nan’ represents lack of predictions from Prospector. We do not have predicted error bars from Prospector for dust mass, hence ACE and IS values corresponding to this property are ‘nan’s. Down arrows imply that lower metric values are better.
Property    Model        NRMSE (↓)  NMAE (↓)  NBE (↓)  ACE (↓)  IS (↓)
            This paper   0.009      0.074     −0.031   −0.051   0.001
            This paper   0.412      0.298     −0.157   −0.041   0.001
Dust Mass   mirkwood     0.475      0.336     −0.215   −0.076   0.001
            This paper   0.044      0.048     −0.009   −0.053   0.016
            This paper   0.223      0.147     −0.047    0.014   0.004
Table 2. Same as Table 1, but for SNR = 10. Down arrows indicate that lower metric values are better.
Property    Model        NRMSE (↓)  NMAE (↓)  NBE (↓)  ACE (↓)  IS (↓)
            This paper   0.092      0.071     −0.026   −0.018   0.001
            This paper   0.391      0.254     −0.143    0.012   0.001
Dust Mass   mirkwood     0.456      0.332     −0.209   −0.033   0.001
            This paper   0.037      0.049      0.007    0.021   0.023
            This paper   0.274      0.114     −0.070    0.027   0.001
Table 3. Same as Table 1, but for SNR = 5. Down arrows indicate that lower metric values are better.
Property    Model        NRMSE (↓)  NMAE (↓)  NBE (↓)  ACE (↓)  IS (↓)
            This paper   0.121      0.062     −0.031   −0.001   0.001
            This paper   0.315      0.224     −0.154    0.002   0.001
Dust Mass   mirkwood     0.480      0.339     −0.219    0.003   0.001
            This paper   0.049      0.048     −0.005   −0.013   0.034
            This paper   0.189      0.171     −0.043    0.061   0.001
Gilda, S. Beyond mirkwood: Enhancing SED Modeling with Conformal Predictions. Astronomy 2024, 3, 14-20.