# Accelerating Energy-Economic Simulation Models via Machine Learning-Based Emulation and Time Series Aggregation

## Abstract

## 1. Introduction

## 2. Literature Review

#### 2.1. Introduction of Emulation, Surrogate, and Meta-Models

#### 2.2. Modeling Process

#### 2.3. Applications of Emulation, Surrogate, and Meta-Models

#### 2.4. Importance of Sampling Method and Sample Size

#### 2.5. Time Series Aggregation with Emulation-, Surrogate-, and Meta-Models

#### 2.6. Conclusion and Paper Contribution

- To our knowledge, the use of sampling methods to select viable training data and the application of TSA to reduce the simulation time for generating training data have not been evaluated in conjunction. We contribute by combining three concepts (sampling, TSA, and emulation) and showing their synergies in terms of simulation time, training time, and accuracy.
- Sampling methods on probability spaces, as commonly used in the literature, do not apply to our problem. Hence, we introduce sampling methods for finite populations and compare them to simple random sampling in terms of their impact on ML accuracy.
- TSA, as used in the literature, is applied to reduce both input and output complexity. In our contribution, we apply TSA as a means of sampling time series to train our ML model, but still predict the entire time series with the emulation (ML) model.
- The examined literature focuses on improvements in the reapplication of the emulation model, whereas optimizations of the simulation model are neglected. In an integrated approach, we show how TSA can help both to reduce the computation time of the simulation model to a minimum and to improve the training time of the emulation model. This speeds up the overall process. The improvements are compared in Section 6.4.
- We apply the methodology, including an intelligent sampling method, TSA, and emulation, to a practical use case: calculating prices in approximately 12,000 German municipalities.
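The idea of using TSA as a sampling step over time can be illustrated with a minimal sketch. It assumes a plain k-means clustering over hourly feature vectors and picks, per cluster, the real hour closest to the centroid as a "typical hour". The function names (`kmeans`, `select_typical_hours`), the toy features, and the choice of k are illustrative assumptions and not the implementation used in this paper.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; initial centroids are k sampled data points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster runs empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assign(points, centroids)


def select_typical_hours(hourly_features, k, seed=0):
    """Return sorted (hour_index, cluster_size) pairs, one per non-empty cluster."""
    centroids, labels = kmeans(hourly_features, k, seed=seed)
    typical = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            # The representative is the real hour closest to the centroid.
            rep = min(members, key=lambda i: sq_dist(hourly_features[i], centroids[c]))
            typical.append((rep, len(members)))
    return sorted(typical)


# Toy data (assumption): 48 hours of (generation, load) with a daily pattern.
hours = [
    (5.0 * max(0.0, math.sin(math.pi * (h % 24 - 6) / 12)),
     1.0 + 0.5 * math.cos(2 * math.pi * (h % 24) / 24) + 0.01 * h)
    for h in range(48)
]
typical_hours = select_typical_hours(hours, k=4)
print(typical_hours)  # only these hours would be simulated and used for training
```

In such a setup, only the selected hours need to be simulated to generate training data, while the trained emulator is still applied to every hour of the full time series. The cluster sizes could serve as sample weights, which is a design option rather than something stated here.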

## 3. Methodology

## 4. Sampling

#### 4.1. Method of Comparison

#### 4.2. Sampling Methods

#### 4.2.1. Simple Random Sampling (SRS)

#### 4.2.2. Stratified Sampling

#### 4.2.3. Balanced Sampling According to Tipton (2014)

#### 4.2.4. k-Means Cluster Sampling

#### 4.2.5. Adaptive Sampling

#### 4.3. Input Data

#### 4.3.1. Dataset 1: Regional Direct Marketing

#### 4.3.2. Dataset 2: Flexibility Potential

#### 4.3.3. Dataset 3: Electricity Price Prediction

#### 4.4. Interpretation of Results

## 5. Clustering-Based Time Series Aggregation

#### 5.1. Model Evaluation

#### 5.2. Cluster Validation

#### 5.3. Energy-Economic Result Interpretation

- Load/consumption is defined as the sum of all consumption within a municipality, regardless of own consumption within households.
- Generation is defined as the sum of all generated energy within a municipality, regardless of own consumption within households.
- Demand is defined as the sum of the remaining load after own consumption of all prosumers.
- Supply is defined as the sum of the remaining feed-in of electricity of all prosumers after own consumption.
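The four quantities can be made concrete with a small sketch for a single time step. It assumes that each prosumer's own consumption equals the smaller of its load and its generation in that time step; this assumption, the function name `community_balance`, and the example values are illustrative and not taken from the simulation model.

```python
def community_balance(prosumers):
    """Aggregate one time step.

    prosumers: list of (load, generation) tuples in the same energy unit.
    """
    load = sum(l for l, _ in prosumers)
    generation = sum(g for _, g in prosumers)
    # Assumption: own consumption per prosumer is min(load, generation),
    # so only the residual load or residual feed-in reaches the community.
    demand = sum(max(l - g, 0.0) for l, g in prosumers)
    supply = sum(max(g - l, 0.0) for l, g in prosumers)
    return {"load": load, "generation": generation,
            "demand": demand, "supply": supply}


# Three hypothetical prosumers: (load, generation)
balance = community_balance([(3.0, 1.0), (1.0, 4.0), (2.0, 0.0)])
print(balance)
```

Under this assumption, load minus demand equals generation minus supply, since both differences are the total own consumption.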

## 6. Case Study: Peer-to-Peer-Prices in German Energy-Sharing Communities

#### 6.1. Pricing Mechanisms in Energy Communities

CO_{2} emissions [67].

#### 6.1.1. Supply and Demand Ratio (SDR)

#### 6.1.2. Mid-Market Rate Pricing (MMR)

#### 6.2. Simulation Model Description and Input Data

#### 6.2.1. Model Description

#### 6.2.2. Data Sources

#### 6.2.3. Simulated Scenario

#### 6.2.4. Time Complexity

#### 6.3. Emulation Model

#### 6.3.1. Regression Model

#### 6.3.2. Training Data and Sampling Method

#### 6.4. Model Validation

#### 6.4.1. Model Validation Method

#### 6.4.2. Emulation Results and Impact of Time Series Aggregation on Model Performance

#### 6.4.3. Improvements of Simulation Time

#### 6.5. Energy-Economic Result Interpretation

## 7. Discussion

- The importance of sampling methods was shown in three energy-economic use cases in Section 0. Cluster and adaptive sampling yielded better results than simple random sampling. Other sampling methods might achieve equal or better results. Further research on other datasets and additional sampling methods should be conducted to confirm these findings.
- The analysis should be conducted not only on other datasets, but also with more and different ML algorithms to show possible advantages or disadvantages of the combinations of certain ML algorithms and sampling methods.
- In Sections 0 and 0, we applied cluster-based time series aggregation to our models and achieved an acceleration of simulation and training while still retaining good overall accuracy. This should be further investigated on other datasets and use cases. Additionally, we intend to investigate the optimal combination of typical hours (TSA) and sample size for training in future research.
- In Section 0, we used TSA to find typical hours, since the time steps did not depend on each other. In future cases with, e.g., battery storage optimization, this is not viable. Instead, typical days or weeks need to be identified that retain the dependencies between time steps over a representative period.
- Since the available processing speed was limited (and no GPUs were available), the ML algorithm used in Section 0 to emulate our simulation model was the random forest regressor. The results were generated using a grid search. However, other algorithms (e.g., deep learning) might yield better results. The impact of other algorithms (on dedicated ML hardware) on the emulation performance (runtime and accuracy) will be tested in the future.
- The energy-economic result interpretation in Section 0 was kept to a necessary minimum to stay within the scope of this paper. In further publications, we will provide deeper insight into the effect of different pricing mechanisms on P2P energy-sharing communities.
- In [7], multi-fidelity surrogate models are proposed to improve the modeling result by adding low-fidelity model results with low computational cost into the training process. In further research, we will examine the impact of this approach on other use cases for emulation modeling.
- For reasons of scope, only PV communities with constant export prices were considered in this paper. Other renewables, as well as possible flexibilities, were not considered. The effects of different scenarios, flexibility (e.g., community battery storage, bidirectional electric vehicles, smart battery and thermal storage), volatile export prices, and other renewable generators (as done in [65]) on prices will be evaluated in future publications. Additionally, costs for, e.g., a community operator were excluded. Business models, etc., will be evaluated in the InDEED project.
- The simulation framework was not built to achieve optimal runtime improvements if only certain time steps are generated. This will be optimized to achieve the full potential, as identified in Section 0.
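The difference between typical hours and typical days or weeks, as raised above for storage use cases, comes down to what is treated as one clustering object. The following sketch shows the reshaping step only; the helper name `to_periods` and the toy series are assumptions for illustration.

```python
def to_periods(series, period_length):
    """Split an hourly series into consecutive periods of equal length.

    Each period keeps the chronological order of its time steps, so a
    clustering over periods retains intra-period dependencies.
    """
    if len(series) % period_length != 0:
        raise ValueError("series length must be a multiple of period_length")
    return [tuple(series[i:i + period_length])
            for i in range(0, len(series), period_length)]


hourly = list(range(48))            # two days of placeholder hourly values
typical_hour_objects = [(v,) for v in hourly]  # one object per hour
typical_day_objects = to_periods(hourly, 24)   # one object per day
print(len(typical_hour_objects), len(typical_day_objects))
```

With typical hours, each of the 48 values is an independent object. With typical days, two 24-dimensional vectors are clustered instead, which preserves, for example, the sequence of charging and discharging hours within a day.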

## 8. Summary and Outlook

## Author Contributions

## Funding

## Institutional Review Board Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Abbreviations

Abbreviation | Meaning |
---|---|
ANN | Artificial Neural Networks |
APE | Absolute Percentage Error |
BEV | battery electric vehicles |
BMWi | German Federal Ministry for Economic Affairs and Energy |
DER | distributed energy resources |
FREM | FfE regionalized energy system modeling tool |
GPR | Gaussian Process Regression |
GPU | graphics processing unit |
HF | high-fidelity |
HSS | home storage systems |
LF | low-fidelity |
LHS | Latin Hypercube Sampling |
LSLPP | large-scale learning and prediction process |
MAE | Mean Absolute Error |
MaStR | Marktstammdatenregister |
ML | machine learning |
MLP | multi-layer perceptron |
MMR | Mid-Market Rate Pricing |
nRMSE | normalized RMSE |
P2P | peer-to-peer |
PtH | power-to-heat |
PV | photovoltaics |
RDM | regional direct marketing |
RF | Random Forests |
RMSE | Root Mean Squared Error |
SDR | Supply and Demand Ratio |
SRS | Simple Random Sampling |
SVR | Support Vector Regression |
TSA | time series aggregation |
TSAM | Time Series Aggregation Module |

## Appendix A

## Appendix B

Hard- & Software | Used Specifications |
---|---|
Operating System | Windows Server 2016 Standard |
CPU | Intel Xeon CPU E5-2680 |
Number of (utilized) cores | 10 |
Total RAM | 496 GB |
GPU | NA |

Hard- & Software | Used Specifications |
---|---|
Operating System | Linux openSUSE Leap 42.3 |
CPU | Intel Xeon CPU E5-2698 v4 2.2 GHz |
Number of (utilized) cores | 40 |
Total RAM | 792 GB |
GPU | NA |

## Appendix C

**Figure A2.** Simplified visualization of the hierarchy and attributes of classes used for generating a digital representation of the assets inside a municipality using the preprocessing module of the simulation framework. Each class inherits all attributes of its parent class.

## Appendix D

**Figure A3.** MAE of the trained ML model (supply) depending on mean generation and consumption within each municipality in the test set.

**Figure A4.** MAE of the trained ML model (demand) depending on mean generation and consumption within each municipality in the test set.

## Appendix E

**Table A3.** Results of training an MLP on the same sample of the P2P-pricing dataset using the sklearn.neural_network.MLPRegressor class. The MLP consists of three hidden layers of 100, 200, and 50 neurons each and uses an adaptive learning rate starting with an initial value of 0.001. All other parameters are the default parameters of sklearn.

Dataset | Error Metric (MLP) | Supply (TSA) | Demand (TSA) |
---|---|---|---|
Test | MAE | 48.351 | 84.57 |
Test | R^{2} | 0.987 | 0.915 |
Benchmark | MAE | 51.042 | 97.10 |
Benchmark | R^{2} | 0.985 | 0.973 |
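The MLP configuration described in the caption of Table A3 could be written as follows. This is a sketch based only on the caption: the three keyword arguments mirror the stated layer sizes and learning rate settings, and everything else is left at the scikit-learn defaults. The import is guarded so that the parameter dictionary can be inspected even where scikit-learn is not installed.

```python
# Parameters taken from the caption of Table A3; all others stay at defaults.
mlp_params = {
    "hidden_layer_sizes": (100, 200, 50),  # three hidden layers
    "learning_rate": "adaptive",           # adaptive learning rate schedule
    "learning_rate_init": 0.001,           # initial learning rate
}

try:
    from sklearn.neural_network import MLPRegressor
    mlp = MLPRegressor(**mlp_params)
except ImportError:  # scikit-learn not available in this environment
    mlp = None

# Note: in scikit-learn, `learning_rate` only takes effect with solver="sgd",
# whereas the default solver is "adam". Which solver was used for Table A3
# is not stated in the caption.
print(mlp_params)
```

Fitting would then follow the usual `mlp.fit(X_train, y_train)` and `mlp.predict(X_test)` pattern, with `X_train`, `y_train`, and `X_test` standing in for the sampled training and test data.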

## References

1. Degeling, K.; IJzerman, M.J.; Koffijberg, H. A scoping review of metamodeling applications and opportunities for advanced health economic analyses. Expert Rev. Pharm. Outcomes Res. **2018**, 19, 181–187.
2. Köhnen, C.; Priesmann, J.; Nolting, L.; Kotzur, L.; Robinius, M.; Praktiknjo, A. The potential of deep learning to reduce complexity in energy system modeling. Int. J. Energy Res. **2021**, 1–22.
3. McGregor, I. The Relationship between Simulation and Emulation. In Proceedings of the 2002 Winter Simulation Conference, San Diego, CA, USA, 8–11 December 2002; Brooks-PRI Automation Inc.: Salt Lake City, UT, USA, 2002.
4. Kasim, M.F.; Watson-Parris, D.; Deaconu, L.; Oliver, S.; Hatfield, P.; Froula, D.H.; Gregori, G.; Jarvis, M.; Khatiwala, S.; Korenaga, J. Building High Accuracy Emulators for Scientific Simulations with Deep Neural Architecture Search; University of Oxford: Oxford, UK, 2020.
5. Chatterjee, S.; Hadi, A. Regression Analysis by Example, 5th ed.; New York University: New York, NY, USA, 2012.
6. Roelofs, R. Measuring Generalization and Overfitting in Machine Learning. Ph.D. Thesis, University of California, Berkeley, CA, USA, 2019.
7. Jiang, P.; Zhou, Q.; Shao, X. Surrogate Model-Based Engineering Design and Optimization; Springer Nature: Singapore, 2020.
8. Dawson-Elli, N.; Lee, S.B.; Pathak, M.; Mitra, K.; Subramanian, V. Data Science Approaches for Electrochemical Engineers: An Introduction through Surrogate Model Development for Lithium-Ion Batteries. J. Electrochem. Soc. **2018**, 165, A1–A15.
9. Rajaram, D.; Puranik, T.G.; Renganathan, S.A.; Sung, W.; Fischer, O.P.; Marvis, D.N.; Ramamurthy, A. Empirical Assessment of Deep Gaussian Process Surrogate Models for Engineering Problems. J. Aircr. **2021**, 58, 182–196.
10. Yang, H.; Hong, S.H.; ZhG, R.; Wang, Y. Surrogate-based optimization with adaptive sampling for microfluidic concentration gradient generator design. RSC Adv. **2020**, 10, 13799–13814.
11. Ibrahim, M.; Al-Sobhi, S.; Mukherjee, R.; AlNouss, A. Impact of Sampling Technique on the Performance of Surrogate Models Generated with Artificial Neural Network (ANN): A Case Study for a Natural Gas Stabilization Unit. Energies **2019**, 12, 1906.
12. Dong, X.; Shen, J.; Wang, W.; Liu, Y.; Shao, L.; Porikli, F. Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; Beijing Institute of Technology: Beijing, China, 2018.
13. Liashchynskyi, P.; Liashchynskyi, P. Grid Search, Random Search, Genetic Algorithm: A Big Comparison for NAS; Ternopil National Economic University: Ternopil, Ukraine, 2019.
14. Peterson, J.L.; Humbird, K.D.; Field, J.E.; Brandon, S.T.; Langer, S.H.; Nora, R.C.; Spears, B.K.; Springer, P.T. Zonal flow generation in inertial confinement fusion implosions. Phys. Plasmas **2017**, 24, 032702.
15. Kannari, L.; Kiljanger, J.; Piira, K.; Piippo, J.; Koponen, P. Building Heat Demand Forecasting by Training a Common Machine Learning Model with Physics-Based Simulator. Forecasting **2021**, 3, 290–302.
16. Testolina, P.; Lecci, M.; Rebato, M.; Testolino, A.; Gambini, J.; Flamini, R.; Mazzucco, C.; Zorzi, M. Enabling Simulation-Based Optimization through Machine Learning: A Case Study on Antenna Design. In Proceedings of the IEEE Global Communication Conference: Wireless Communicatio (GLOBECOM2019 WC), Waikoloa, HI, USA, 9–13 December 2019; University of Padova: Padova, Italy, 2019.
17. Vazquez-Canteli, J.; Demir, A.D.; Brown, J.; Nagy, Z. Deep Neural Networks as Surrogate Models for Urban Energy Simulations. In Proceedings of the Journal of Physics: Conference Series Volume 1343, CISBAT 2019|Climate Resilient Cities – Energy Efficiency & Renewables in the Digital Era, Lausanne, Switzerland, 4–6 September 2019; École Polytechnique Fédérale de Lausanne (EPFL): Lausanne, Switzerland, 2019.
18. Thiagarajan, J.J.; Venkatesh, B.; Anirudh, R.; Bremer, P.; Gaffney, J.; Anderson, G.; Spears, B. Designing accurate emulators for scientific processes using calibration-driven deep models. In Nature Communications; Lawrence Livermore National Laboratory: Livermore, CA, USA, 2020; Volume 11, p. 5622.
19. Balduin, S. Surrogate models for composed simulation models in energy systems. In Proceedings of the 7th DACH+ Conference on Energy Informatics, Oldenburg, Germany, 11–12 October 2018; Institute of Information Technology: Oldenburg, Germany, 2018.
20. Balduin, S.; Westermann, T.; Puiutta, E. Evaluating different machine learning techniques as surrogate for low voltage grids. In Proceedings of the 9th DACH+ Conference on Energy Informatics, Sierre, Switzerland, 29–30 October 2020; Springer Nature: Berlin, Germany, 2020.
21. Monterrubio-Velasco, M.; Carrasco-Jimenez, J.C.; Rojas, O.; Rodriguez, J.E.; Modesto, D.; de la Puente, J. Source Parameter Sensitivity of Earthquake Simulations assisted by Machine Learning. In Proceedings of the EGU General Assembly 2021, Online, 19–30 April 2021; EGU21-5995. Barcelona Supercomputing Center, CASE: Barcelona, Spain, 2021.
22. Deist, T.M.; Patti, A.; Wang, Z.; Krane, D.; Sorenson, T.; Craft, D. Simulation assisted machine learning. In Bioinformatics; Harvard Medical School: Boston, MA, USA, 2019; Volume 35, pp. 4072–4080.
23. Pan, X.; You, Y.; Wang, Z.; Lu, C. Virtual to Real Reinforcement Learning for Autonomous Driving. In Proceedings of the British Machine Vision Conference, London, UK, 4–7 September 2017; University of California: Berkeley, CA, USA, 2017.
24. Tesla, Inc. Tesla AI Day in 19 Minutes (Supercut). USA: Tesla Daily, 2021. Available online: https://www.youtube.com/watch?v=keWEE9FwS9o (accessed on 16 December 2021).
25. Rupp, M.; Tkatchenko, A.; Müller, K.R.; von Lilienfeld, O.A. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Phys. Rev. Lett. **2012**, 108, 058301.
26. Kim, B.; Azevedo, V.C.; Thuerey, N.; Kim, T.; Gross, M.; Solenthaler, B. Deep Fluids: A Generative Network for Parameterized Fluid Simulations. Comput. Graph. Forum **2019**, 38, 59–70.
27. Etemadi, N. On the Laws of Large Numbers for Nonnegative Random Variables. J. Multivar. Anal. **1983**, 13, 187–193.
28. Junlin, Y.; Moawad, A. Vehicle energy consumption estimation using large scale simulations and machine learning methods. Transp. Res. Part C Emerg. Technol. **2019**, 101, 276–296.
29. Balki, I.; Amirabadi, A.; Levman, J.; Martel, A.L.; Emersic, Z.; Meden, B.; Garcia-Pedrero, A.; Ramirez, S.C.; Kong, D.; Moody, A.R.; et al. Sample-Size Determination Methodologies for Machine Learning in Medical Imaging Research: A Systematic Review. Can. Assoc. Radiol. J. **2019**, 70, 344–353.
30. Davis, S.E.; Cremaschi, S.; Eden, M.R. Efficient Surrogate Model Development: Impact of Sample Size and Underlying Model Dimensions. Comput. Aided Chem. Eng. **2018**, 44, 979–984.
31. Zahura, F.; Goodall, J.L.; Sadler, J.M.; Shen, Y.; Morsy, M.M.; Behl, M. Training Machine Learning Surrogate Models from a High-Fidelity Physics-Based Model: Application for Real-Time Street-Scale Flood Prediction in an Urban Coastal Community. Water Resour. Res. **2020**, 56, e2019WR027038.
32. Cai, Y.; Guan, K.; Peng, J.; Wang, S.; Seifert, C.; Wardlow, B.D.; Li, Z. A high-performance and in-season classification system of field-level crop types using time-series Landsat data and a machine learning approach. Remote Sens. Environ. **2018**, 210, 35–47.
33. Ahmad, T.; Chen, H. Potential of three variant machine-learning models for forecasting district level medium-term and long-term energy demand in smart grid environment. Energy **2018**, 160, 1008–1020.
34. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical Secure Aggregation for Privacy-Preserving Machine Learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, New York, NY, USA, 30 October–3 November 2017; The University of Texas at Dallas: Dallas, TX, USA, 2017.
35. Konečný, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated Learning: Strategies for Improving Communication Efficiency; University of Edinburgh: Edinburgh, UK, 2016.
36. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. **2011**, 12, 2825–2830.
37. Burg, T.; Kowarik, A.; Six, M.; Brancato, G.; Krapavickaité, D. Quality Guidelines for Frames in Social Statistics – ESSnet KOMUSO Quality in Multisource Statistics (Version 1.51, 2019-09-30); Eurostat: Brussels, Belgium, 2019.
38. Dodge, Y. Sampling. In The Concise Encyclopedia of Statistics, 2008th ed.; Springer: New York, NY, USA, 2008.
39. Dodge, Y. Stratified Sampling. In The Concise Encyclopedia of Statistics; Springer: New York, NY, USA, 2008.
40. Wang, J.; Haining, R.; Cao, Z. Sample surveying to estimate the mean of a heterogeneous surface: Reducing the error variance through zoning. Int. J. Geogr. Inf. Sci. **2010**, 24, 523–543.
41. Dodge, Y. Cluster Sampling. In The Concise Encyclopedia of Statistics, 2008th ed.; Springer: New York, NY, USA, 2008.
42. Bogensperger, A.; Fabel, Y. A practical approach to cluster validation in the energy sector. In Proceedings of the 10th DACH+ Conference on Energy Informatics, Virtual, 13–17 September 2021; INATECH – Albert-Ludwigs-Universität Freiburg: Freiburg, Germany, 2021.
43. Tipton, E. Stratified Sampling Using Cluster Analysis: A Sample Selection Strategy for Improved Generalizations from Experiments. In Evaluation Review; Columbia University: New York, NY, USA, 2014; Volume 37.
44. Syakur, M.A.; Khotimah, B.K.; Rochman, E.M.S.; Satoto, B.D. Integration K-Means Clustering Method and Elbow Method for Identification of The Best Customer Profile Cluster. IOP Conf. Ser. Mater. Sci. Eng. **2018**, 336, 012017.
45. Brus, D.J.; de Gruijter, J.J.; van Groeningen, J.W. Chapter 14 Designing Spatial Coverage Samples Using the k-means Clustering Algorithm. In Developments in Soil Science; Elsevier: Amsterdam, The Netherlands, 2006.
46. Fuhg, J.N.; Fau, A.; Nackenhorst, U. State-of-the-art and Comparative Review of Adaptive Sampling Methods for Kriging. In Archives of Computational Methods in Engineering; Leibniz Universität Hannover, Université Paris-Saclay: Hannover, Germany; Paris, France, 2021; Volume 28.
47. Settles, B. Active Learning Literature Survey – Technical Report #1648; University of Wisconsin Madison: Madison, WI, USA, 2009.
48. Bamdad, K.; Cholette, M.E.; Bell, J. Building energy optimization using surrogate model and active sampling. J. Build. Perform. Simul. **2020**, 13, 760–776.
49. Corradini, R.; Konetschny, C.; Schmid, T. FREM – Ein regionalisiertes Energiesystemmodell. In et-Energiewirtschaftliche Tagesfragen Heft 1/2 2017; Forschungsstelle für Energiewirtschaft: München, Germany, 2017.
50. Marktstammdatenregister-Öffentliche Einheitenübersicht. Bonn: Bundesnetzagentur. 2019. Available online: https://www.marktstammdatenregister.de/MaStR/Einheit/Einheiten/OeffentlicheEinheitenuebersicht (accessed on 7 March 2019).
51. EEG-Anlagenstammdaten Zur Jahresabrechnung 2015. Berlin, Dortmund, Bayreuth, Stuttgart: Übertragungsnetzbetreiber (ÜNB), 2016. Available online: https://www.netztransparenz.de/EEG/Anlagenstammdaten (accessed on 27 December 2016).
52. Schmid, T.; Jetter, F.; Limmer, T. Regionalisierung des Ausbaus der Erneuerbaren Energien – Begleitdokument Zum Netzentwicklungsplan Strom 2035 (Version 2021); Forschungsstelle für Energiewirtschaft e.V. (FfE): München, Germany, 2021.
53. Heimerl, S.; Giesecke, J. Wasserkraftanteil an der elektrischen Stromerzeugung in Deutschland 2003. In Wasserwirtschaft (WaWi); Vieweg+Teubner Verlag: Wiesbaden, Germany, 2004.
54. Fahrzeugzulassungen (FZ). Bestand an Kraftfahrzeugen und Kraftfahrzeuganhängern nach Zulassungsbezirken; 1 January 2021 (FZ1); Kraftfahrt-Bundesamt: Flensburg, Germany, 2021.
55. Bundesamt für Kartographie und Geodäsie (BKG). Vektordaten Bundesrepublik Deutschland – Verwaltungsgrenzen 1:250,000 (VG250); Bundesamt für Kartographie und Geodäsie: Frankfurt am Main, Germany, 2009.
56. OpenStreetMap (OSM) – OpenStreetMap und Mitwirkende. Cambridge: OpenStreetMap Foundation, 2004. Available online: http://www.openstreetmap.org/ (accessed on 14 October 2019).
57. Statistisches Bundesamt. Zensus 2011 – Ausgewählte Ergebnisse; Statistisches Bundesamt: Wiesbaden, Germany, 2013.
58. Müller, M.; Reinhard, J.; Ostermann, A.; Estermann, T.; Köppl, S. Regionales Flexibilitäts-Potenzial dezentraler Anlagen – Modellierung und Bewertung des Regionalen Flexibilitäts-Potenzials von Dezentralen Flexibilitäts-Typen im Verteilnetz; Conexio GmbH: Berlin, Germany, 2019.
59. Clinton, N. Energy Price Prediction [ML]. Mountain View: Kaggle Inc., 2021. Available online: https://www.kaggle.com/nigelclinton/energy-price-prediction-ml (accessed on 13 December 2021).
60. Manjunath, M.; Zhang, Y.; Yeo, S.H.; Sobh, O.; Russell, N.; Followell, C.; Bushell, C.; Ravaioli, U.; Song, J.S. ClusterEnG: An interactive educational web resource for clustering and visualizing high-dimensional data. PeerJ Comput. Sci. **2018**, 4, e155.
61. Rodriguez, M.Z.; Comin, C.H.; Casanove, D.; Bruno, O.M.; Amancio, D.R.; Costa, L.d.F.; Rodrigues, F.A. Clustering algorithms: A comparative approach. PLoS ONE **2019**, 14, e0210236.
62. Kumar, A.; Kumar, S. Density Based Initialization Method for K-Means Clustering Algorithm. Int. J. Intell. Syst. Appl. **2017**, 9, 40–48.
63. Kumar, P. Computational Complexity of ML Models. Cork: Analytics Vidhya, 2019. Available online: https://medium.com/analytics-vidhya/time-complexity-of-ml-models-4ec39fad2770 (accessed on 16 December 2021).
64. Hoffmann, M.; Kotzur, L.; Stolten, D.; Robinius, M. A Review on Time Series Aggregation Methods for Energy System Models. Energies **2020**, 13, 641.
65. Bogensperger, A.; Ferstl, J.; Yu, Y. Comparison of Pricing Mechanisms in Peer-to-Peer Energy Communities. In 12th. Internationale Energiewirtschaftstagung (IEWT) 2021; Technische Universität Wien: Wien, Austria, 2021.
66. Naser, M.Z.; Alavi, A. Insights into Performance Fitness and Error Metrics for Machine Learning; University of Clemson: Clemson, SC, USA, 2020.
67. Bogensperger, A.; Zeiselmair, A. Updating renewable energy certificate markets via integration of smart meter data, improved time resolution and spatial optimization. In Proceedings of the 17th International Conference on the European Energy Market (EEM2020), Stockholm, Sweden, 16–18 September 2020; Forschungsstelle für Energiewirtschaft e.V.: München, Germany, 2020.
68. Liu, N.; Yu, X.; Wang, C.; Li, C.; Ma, L.; Lei, J. An Energy Sharing Model with Price-based Demand Response for Microgrids of Peer-to-Peer Prosumers. In IEEE Transactions on Power Systems June 2017; North China Electric Power University: Beijing, China, 2017.
69. Zhou, Y.; Wu, J.; Long, C. Evaluation of peer-to-peer energy sharing mechanisms based on a multiagent simulation framework. Appl. Energy **2018**, 222, 993–1022.
70. Long, C.; Wu, J.; Zahng, C.; Thomas, L.; Cheng, M.; Jenkins, N. Peer-to-Peer Energy Trading in a Community Microgrid. In Proceedings of the 2017 IEEE Power & Energy Society General Meeting, Chicago, IL, USA, 16–20 July 2017.
71. Schmid, T. The FfE Regionalized Energy System Model (FREM); Forschungsstelle für Energiewirtschaft e.V. (FfE): Munich, Germany, 2014.
72. Zensusdatenbank des Zensus 2011. Wiesbaden: Statistische Ämter des Bundes und der Länder, 2013. Available online: https://ergebnisse.zensus2011.de/ (accessed on 1 December 2021).
73. Wohnungen und Gebäude je Hektar – Ergebnisse des Zensus am 9. Mai 2011 in Gitterzellen; Statistische Ämter des Bundes und der Länder: Wiesbaden, Germany, 2018.
74. Haushalte im 100 Meter-Gitter – Ergebnisse des Zensus Am 9. Mai 2011 in Gitterzellen. Wiesbaden: Statistische Ämter des Bundes und der Länder, 2018. Available online: https://www.zensus2011.de/DE/Home/Aktuelles/DemografischeGrunddaten.html (accessed on 1 December 2021).
75. Müller, M.; Biedenbach, F.; Reinhard, J. Development of an Integrated Simulation Model for Load and Mobility Profiles of Private Households. Energies **2020**, 13, 3843.
76. European Network of Transmission System Operators for Electricity: Transparency Platform. Laufende Aktualisierung Seit 2014. Available online: https://transparency.entsoe.eu/ (accessed on 1 December 2021).
77. The Ecoinvent Database, Version 3.6. Zürich: Ecoinvent, 2019. Available online: www.ecoinvent.org (accessed on 1 December 2021).
78. Fattler, S. Economic and Environmental Assessment of Electric Vehicle Charging Strategies. Ph.D. Thesis, Technische Universität München, Munich, Germany, 2021. Available online: https://mediatum.ub.tum.de/doc/1601943/1601943.pdf (accessed on 1 December 2021).
79. Power Market Data. Paris: EPEX SPOT, 2019. Available online: https://www.epexspot.com/en/market-data/ (accessed on 16 May 2019).
80. Netzentwicklungsplan Strom 2035, Version 2021 – Zweiter Entwurf der Übertragungsnetzbetreiber; Übertragungsnetzbetreiber: Berlin, Germany, 2021.
81. Referat Netzentwicklung Stromübertragungsnetz: Genehmigung des Szenariorahmens 2021–2035; Bundesnetzagentur für Elektrizität, Gas, Telekommunikation, Post und Eisenbahnen: Bonn, Germany, 2020.
82. Guminski, A.; Fiedler, C.; Kigle, S.; Pellinger, C.; Dossow, P.; Ganz, K.; Jetter, F.; Limmer, T.; Murmann, A.; Rheinhard, J.; et al. eXtremOS Summary Report – Modeling Kit and Scenarios for Pathways Towards a Climate Neutral Europe; FfE: Munich, Germany, 2021.
83. Strompreis für Haushalte. Berlin: BDEW, 2019. Available online: https://www.bdew.de/service/daten-und-grafiken/strompreis-fuer-haushalte/ (accessed on 15 December 2021).
84. Ali, J.; Rehanullah, K.; Nasir, A.; Imran, M. Random Forests and Decision Trees. IJCSI Int. J. Comput. Sci. **2012**, 9, 272.
85. Kern, T.; Dossow, P.; von Roon, S. Integrating Bidirectionally Chargeable Electric Vehicles into the Electricity Markets. Energies **2020**, 13, 5812.
86. Lastprofilverfahren – Lastprofile für Lieferanten der EEG-Werke; Stadtwerke Norderstedt: Norderstedt, Germany, 2016.

**Figure 9.** Distribution of prediction errors per described feature of all German municipalities (n = 11,977) with 50 typical hours, neglecting values outside the range between the 1st and 99th percentiles.

**Figure 11.** Simulation time of municipalities by number of buildings for all generated municipalities excluding 1% of outliers (n = 11,838).

**Figure 12.** Detailed structure of the simulation model and the parts substituted by our emulation model (gray).

**Figure 13.** Supply duration curves for nine municipalities of different sizes, including the ground truth, the model prediction, own consumption, and the total consumption within the community.

**Figure 14.** Supply duration curves for nine municipalities of different sizes, including the ground truth, the model prediction, and the total generation within the community.

**Figure 15.** Simulation sampling, training, testing, and benchmarking time of the supply and demand model with and without time series aggregation.

**Figure 16.** Model accuracy as a function of sampling size using TSA for training. The testing was conducted on the benchmark dataset.

**Figure 17.** Distribution of annual average P2P prices of all German municipalities (n = 11,977). Actual prices are shown in (**a**), while weighted prices are displayed in (**b**). The prices in (**b**) have been weighted per hour according to the current demand (buy prices) or supply (sell prices).

**Figure 18.** Price duration curve of average P2P prices (per hour) of all municipalities in the population (n = 11,977).

**Figure 19.** Regional disparities in savings from leveraging P2P buy prices versus normal retail pricing.

**Figure 20.** Regional disparities in added value from leveraging P2P sell prices versus selling at the market.

Dataset | Error Metric | Supply (no TSA) | Supply (TSA) | Demand (no TSA) | Demand (TSA) |
---|---|---|---|---|---|
Test | MAE | 15.196 | 19.787 | 19.904 | 19.979 |
Test | R^{2} | 0.993 | 0.991 | 0.990 | 0.989 |
Benchmark | MAE | 16.695 | 22.230 | 25.470 | 26.735 |
Benchmark | R^{2} | 0.990 | 0.988 | 0.996 | 0.994 |
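For readers who want to reproduce the two error metrics reported in the table, the standard definitions of MAE and R^{2} can be sketched as follows. The function names and the example values are illustrative; the numbers are not taken from the case study.

```python
def mae(y_true, y_pred):
    """Mean absolute error between ground truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical simulated values (ground truth) and emulator predictions.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
example_mae = mae(y_true, y_pred)
example_r2 = r_squared(y_true, y_pred)
print(example_mae, example_r2)
```

MAE is expressed in the unit of the target variable, whereas R^{2} is dimensionless, which is why the two metrics can rank models differently across the supply and demand columns.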

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bogensperger, A.J.; Fabel, Y.; Ferstl, J.
Accelerating Energy-Economic Simulation Models via Machine Learning-Based Emulation and Time Series Aggregation. *Energies* **2022**, *15*, 1239.
https://doi.org/10.3390/en15031239
