Article

On the Parametrization of Epidemiologic Models—Lessons from Modelling COVID-19 Epidemic

Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Haertelstrasse 16-18, 04107 Leipzig, Germany
Viruses 2022, 14(7), 1468; https://doi.org/10.3390/v14071468
Submission received: 11 May 2022 / Revised: 26 June 2022 / Accepted: 29 June 2022 / Published: 2 July 2022
(This article belongs to the Special Issue Mathematical Modeling of the COVID-19 Pandemic)

Abstract

Numerous prediction models of the SARS-CoV-2 pandemic have been proposed. Unknown parameters of these models are often estimated from observational data. However, lags in case reporting, changing testing policies, or incomplete data lead to biased estimates. Moreover, parametrization is time-dependent due to changing age structures, emerging virus variants, non-pharmaceutical interventions, and vaccination programs. To cover these aspects, we propose a principled approach to parametrize a SIR-type epidemiologic model by embedding it as a hidden layer into an input-output non-linear dynamical system (IO-NLDS). Observable data are coupled to hidden states of the model by appropriate data models that account for possible biases of the data, such as known reporting delays. We estimate model parameters, including their time-dependence, by a Bayesian knowledge synthesis process that uses parameter ranges derived from external studies as prior information. We applied this approach to a specific SIR-type model and data of Germany and Saxony, demonstrating good prediction performance. Our approach can estimate and compare the relative effectiveness of non-pharmaceutical interventions and provide scenarios of the future course of the epidemic under specified conditions. It can be translated to other data sets, i.e., other countries, and to other SIR-type models.

1. Introduction

Predicting the spread of an infectious disease is a pressing need, as demonstrated by the present SARS-CoV-2 pandemic. Due to the high worldwide disease burden, a plethora of mathematical epidemiologic models have been proposed. These include auto-regressive time series methods, Bayesian techniques, and applications of deep learning methods, but also mechanistic models and hybrid models combining some of these approaches (see [1] for a review). Among mechanistic models, based either on agents [2,3,4] or on compartments [5], the most commonly proposed and published model type is the classical SIR (S = susceptible, I = infected, R = recovered) type compartment model, which was presented with different modifications, often considering further aspects and details such as disease states, age structure, contact patterns, and intervention effects. Major aims of these models are to predict (1) the dynamics of infected subjects, (2) requirements of medical resources during the course of the epidemic, or (3) the effectiveness of non-pharmaceutical intervention programs (NPI) [6,7]. Examples of models addressing these aims are a SECIR model (E = exposed, C = cases) proposed by Khailaie et al. [8], a SEIR-type model proposed by Barbarossa et al. [9], models from the Robert-Koch-Institute [10], and from Dehning et al. [11].
Good prediction performance depends not only on the precise structure of the model but also on its parametrization. This, however, is a non-trivial and often underestimated task due to the following issues, which apply to other infectious diseases as well: key epidemiologic parameters are often unknown or only known within ranges. Therefore, parametrization based on observational data is a common approach. However, officially reported databases are heterogeneous and often biased due to (1) lag in reporting of cases/events [12], (2) changing testing policy, either due to limited testing capacities, which might depend on the pandemic situation itself, or due to changing risk profiles of people to be tested (e.g., defined risk groups, dependence on symptoms, degree of prophylactic testing) [13], and (3) incompleteness of data [14]. Moreover, parametrization depends on further epidemiologic issues, comprising (1) a changing age structure of the infected population with impact on symptomatology, hospital or intensive care requirements and mortality, (2) spatial heterogeneity of the spread of the disease driven by local conditions and outbreaks, (3) new pathogen variants becoming prevalent, (4) non-pharmaceutical interventions continuously updated in response to the pandemic situation, and finally, (5) the progress of vaccination programs and their effectiveness.
Due to these interacting complexities, it is close to impossible to construct a fully mechanistic model covering all these aspects in parallel. Therefore, we here propose a framework of epidemiologic model parametrization, which accounts for these issues in a more phenomenological, data-driven way applicable even for limited or biased data resources.
In detail, we here propose to integrate a mechanistic epidemiologic model as a hidden layer into an input-output non-linear dynamical system (IO-NLDS), i.e., the true epidemiologic dynamics cannot be directly observed. This allows distinguishing between features explicitly modelled (in our case different virus variants, vaccination) and changing factors of the system which are difficult to model mechanistically (in our case changes of contacts, e.g., due to non-pharmaceutical interventions or changing contact behavior, changing age-structure of the infected population and changes in testing policy, in the following abbreviated as NPI/behavior). These factors are imposed as external inputs of the system.
We then estimate parameters by a knowledge synthesis process considering prior information of parameter ranges derived from different external studies and other available data resources such as public data. We are thus going beyond previous modeling approaches that only used point estimates for parameters [15,16]. Specifically, we use Bayesian inference for the parameter estimation, which could also be time-dependent. We analyze the structure of available public data in detail and translate it to model outputs linked by an appropriate data model to the hidden states of the IO-NLDS, i.e., the epidemiologic model. We demonstrate this approach on an example epidemiologic model of SECIR type for SARS-CoV-2 and data of Germany and Saxony, but our method can be translated to other countries, other models or even other infectious disease contexts.

2. Materials and Methods

2.1. General Approach

We consider input-output non-linear dynamical systems (IO-NLDS), originally proposed as time-discrete alternatives to physiological pharmacokinetic and pharmacodynamic differential equation models [17]. This class of models couples a set of input parameters, such as external influences and factors, with a set of output parameters, i.e., observations, through a hidden model structure to be learned (named core model in the following). This coupling is not necessarily fully deterministic, i.e., data are not required to directly represent state parameters of the model. This is a major feature of our approach, as it allows us to account for different types of biases in available observational time series data.
We here demonstrate our approach by using an epidemiological model as the core of the IO-NLDS. Non-pharmaceutical interventions, changes of testing policy, age distributions and severity of disease courses are modelled phenomenologically by external control parameters imposed on the epidemiologic model via the input layer of the IO-NLDS. Random influxes of infected subjects, e.g., by travelling activities or outbreaks, are also considered by this approach. The numbers of reported infections, intensive care (IC) cases, and deaths are considered as output parameters that do not directly represent the hidden states of the model due to several data issues including reporting delays. The model is then fitted to data by a full-information approach, i.e., all data points are evaluated by a suitable likelihood function.
The single steps of this process are explained in detail below.
Assumptions of the core model
We adapted a standard SECIR model (SECIR = susceptible, exposed, cases, infectious, recovered) for pandemic spread. We introduced an asymptomatic compartment to account for infected patients who do not have symptoms, a common condition of SARS-CoV-2 infection. A compartment of patients requiring intensive care (IC) was added to model the respective requirements, and we distinguished between deceased and recovered patients.
We subdivided most of the compartments into three sub-compartments with first-order transitions to model time delays. For the sake of parsimony, transition rates between sub-compartments are the same within each compartment. This approach is extensively used in pharmacological models [10]; it is equivalent to a Gamma-distributed transit time [11]. To allow for two concurrent virus variants with differing properties, the compartments of asymptomatic and symptomatic infected subjects are duplicated. This allows us, for example, to simulate the take-over of the more infectious B.1.1.7 (Alpha) and, later, the B.1.617.2 (Delta) variant observed, e.g., in all European countries [12].
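The equivalence between a chain of first-order sub-compartments and a Gamma-distributed transit time can be checked numerically. The following is a minimal sketch, not the authors' R/C++ implementation: the transition rate of 0.5 per day, the Euler step size, and the function names are illustrative assumptions. It pushes one unit of mass through three sub-compartments and compares the exit flow with the Erlang density (a Gamma distribution with integer shape 3).

```python
import math

def chain_outflow(rate, n_sub=3, days=60, dt=0.01):
    """Push one unit of mass through n_sub first-order sub-compartments (Euler steps)."""
    x = [1.0] + [0.0] * (n_sub - 1)
    times, outflow = [], []
    for step in range(int(days / dt)):
        flows = [rate * xi for xi in x]  # efflux of each sub-compartment
        times.append(step * dt)
        outflow.append(flows[-1])        # exit flow = density of the total transit time
        for i in range(n_sub):
            inflow = flows[i - 1] if i > 0 else 0.0
            x[i] += dt * (inflow - flows[i])
    return times, outflow

def erlang_pdf(t, rate, k=3):
    """Gamma density with integer shape k and the given rate."""
    return rate**k * t**(k - 1) * math.exp(-rate * t) / math.factorial(k - 1)

DT = 0.01
RATE = 0.5  # hypothetical per-day transition rate between sub-compartments
times, outflow = chain_outflow(RATE, dt=DT)
mean_transit = sum(t * f for t, f in zip(times, outflow)) * DT
max_err = max(abs(f - erlang_pdf(t, RATE)) for t, f in zip(times, outflow))
print(f"mean transit time: {mean_transit:.2f} days (theory: {3 / RATE:.2f})")
print(f"largest deviation from the Erlang density: {max_err:.4f}")
```

With three sub-compartments and a common rate, the mean transit time is three divided by the rate, and the spread is narrower than for a single exponential compartment with the same mean.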
The general scheme of the IO-NLDS system is shown in Figure 1. We make the following assumptions:
  • The input layer consists of external modifiers influencing (1) reporting policy (e.g., changing testing policy), (2) rates of infections (affected by non-pharmaceutical interventions, age structure, influx of cases), and (3) risks of severe disease conditions such as IC requirements and deaths, also depending on the changing age structure of infected subjects.
  • The output layer of observable data is linked to the hidden layers of the core model by specific data models (see later).
  • Susceptible, non-infected people (Sc): We assume that 100% of the population is susceptible to infection at the beginning of the epidemic.
  • The latent state E comprises infected but non-infectious people.
  • The asymptomatic infected state IA has three sub-compartments (I_(A,1), I_(A,2) and I_(A,3)). From I_(A,1), transitions to the symptomatic state or the second asymptomatic state are possible. From I_(A,2), only transitions to I_(A,3) and then to the recovered state R are assumed.
  • The symptomatic infected state IS is also divided into three compartments (I_(S,1), I_(S,2), and I_(S,3)). The sub-compartment I_(S,1) comprises an efflux toward the sub-compartment C_1 representing deteriorations toward critical disease states. Otherwise, the patient transits to I_(S,2). From I_(S,2), a patient can either die representing deaths without prior intensive care or transit to I_(S,3). Finally, the efflux of I_(S,3) flows into R representing resolved disease courses.
  • Both cases I_A and I_S contribute to new infections but with different rates to account for differences in infectivity and quarantine probabilities.
  • The compartment C represents critical disease states requiring intensive care. We assume that these patients are not infectious due to isolation. Again, the compartment is divided into three sub-compartments, C_1, C_2, and C_3. In C_1, a patient can either die or transit to C_2, C_3, and finally, R.
  • Patients on the recovered stage R are assumed to be immune against re-infections.
  • We duplicate the compartments E, I_(A,1),…, I_(A,3), I_(S,1),…, I_(S,3) to account for two concurrent virus variants. We assume different infectivities for the two variants. All other parameters are assumed equal. No co-infections are assumed.
These assumptions are translated into a difference equation system (see Appendix A). Model compartments and their properties are explained in Table 1.
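To illustrate what a daily difference-equation update looks like, here is a strongly reduced sketch. It is not the system from Appendix A: sub-compartments, the second virus variant, and deaths without prior intensive care are omitted, and all parameter names other than b1, b2, r1, and r2, as well as all numerical values, are hypothetical.

```python
def secir_step(state, p):
    """Advance a reduced SECIR-like model by one day using difference equations."""
    S, E, IA, IS, C = state["S"], state["E"], state["IA"], state["IS"], state["C"]
    N = S + E + IA + IS + C + state["R"]  # living population
    # New infections caused by asymptomatic and symptomatic subjects; b1 and b2
    # stand for the current values of the time-dependent input functions.
    new_inf = min(S, (p["r1"] * p["b1"] * IA + p["r2"] * p["b2"] * IS) * S / N)
    e_out = p["r_E"] * E               # E -> IA
    ia_out = p["r_A"] * IA             # total efflux of IA
    ia_to_is = p["p_symp"] * ia_out    # IA -> IS, the remainder recovers
    is_out = p["r_S"] * IS             # total efflux of IS
    is_to_c = p["p_crit"] * is_out     # IS -> C, the remainder recovers
    c_out = p["r_C"] * C               # total efflux of C
    c_to_d = p["p_death"] * c_out      # C -> D, the remainder recovers
    return {
        "S": S - new_inf,
        "E": E + new_inf - e_out,
        "IA": IA + e_out - ia_out,
        "IS": IS + ia_to_is - is_out,
        "C": C + is_to_c - c_out,
        "R": state["R"] + (ia_out - ia_to_is) + (is_out - is_to_c) + (c_out - c_to_d),
        "D": state["D"] + c_to_d,
    }

# Hypothetical parameter values, chosen only to make the sketch run.
params = {"r1": 0.3, "r2": 0.3, "b1": 1.0, "b2": 0.5, "r_E": 0.3, "r_A": 0.3,
          "p_symp": 0.5, "r_S": 0.2, "p_crit": 0.05, "r_C": 0.1, "p_death": 0.3}
# Rough population size of Germany with 100 initially infected subjects.
state = {"S": 83_000_000.0 - 100.0, "E": 0.0, "IA": 100.0, "IS": 0.0,
         "C": 0.0, "R": 0.0, "D": 0.0}
total_start = sum(state.values())
for _ in range(100):
    state = secir_step(state, params)
total_end = sum(state.values())
```

Because every efflux of one compartment reappears as an influx of another, the sum over all compartments stays constant, which is a useful sanity check for any implementation of such a system.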
All parameters of the model are described in Table 2. The complete dynamics of the epidemic in Germany are shown in Figure 2 and Figure 3.

2.2. Input Layer

The input layer represents external factors acting on the SECIR model, effectively changing its parameters [42]. We define step functions b1 and b2 as time-dependent input parameters modifying the rate of infections caused by asymptomatic and symptomatic subjects, respectively. To identify dates of change, we used a data-driven approach based on a Bayesian Information Criterion, informed by changes in non-pharmaceutical interventions for Germany based on Government decisions, changing testing policies, and events with impact on epidemiological dynamics such as holidays or sudden outbreaks. Details can be found in Appendix B.
We also accounted for changes in the probabilities of critical courses and mortality. These can be explained by changes in testing policies covering asymptomatic cases to a different extent (for example symptomatic testing only vs. introduction of screening tests, e.g., rapid antigen tests), by shifts in the age distribution of patients, or by changes in patient care efficacy (new pharmaceutical treatment, overstretched medical resources). Again, this is implemented by step functions, pcrit and pdeath, respectively. The number of steps is determined on the basis of a Bayesian Information Criterion. Details can be found in Appendix B as well as in Table A5 from Appendix I and Table A8 from Appendix J. The parameter PS,M represents the percentage of reported infected symptomatic subjects in relation to all symptomatic subjects. This value is assumed to be constant (50%) in the present version of the model. We describe the parameters defining the input layer in Table 3.
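A step function of this kind is simply a piecewise-constant lookup over a sorted list of change dates. The sketch below shows one possible representation; the change dates and values are invented for illustration and are not the estimates reported in the appendices.

```python
from bisect import bisect_right
from datetime import date

def step_value(change_dates, values, day):
    """Piecewise-constant input: values[i] applies from change_dates[i-1]
    (inclusive) until change_dates[i] (exclusive); values has one more
    entry than change_dates."""
    return values[bisect_right(change_dates, day)]

# Hypothetical change dates and relative infectivity levels.
change_dates = [date(2020, 3, 16), date(2020, 5, 4)]
b1_values = [1.0, 0.4, 0.7]

before = step_value(change_dates, b1_values, date(2020, 3, 1))
on_change = step_value(change_dates, b1_values, date(2020, 3, 16))
after = step_value(change_dates, b1_values, date(2020, 6, 1))
```

Each additional change date adds one free parameter, which is why the number of steps needs to be limited by an information criterion.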

2.3. Output Layer and Data

We fit our model to time series data of reported numbers of infections IS,M, deaths DM, and occupation of ICU beds CM, which represent the output layer of our IO-NLDS model. The data source for infections and deaths was the official reports of the Robert-Koch-Institute (RKI) between 4 March 2020 and 29 March 2021. Numbers of critical cases were retrieved from the German Interdisciplinary Association of Intensive and Emergency Medicine (Deutsche Interdisziplinäre Vereinigung für Intensiv- und Notfallmedizin e.V., DIVI) between 25 March 2020 and 29 March 2021. Time points in proximity to Christmas and the turn of the year (19 December 2020 to 19 January 2021) were heavily biased and therefore omitted during parameter fitting.
However, also for the considered time intervals several sources of bias need to be considered. We handled these issues as explained in the following:
Infected cases: We first smoothed reported numbers of infections with a sliding window of seven days centered on the time point of interest to control for strong weekly periodicity. We assume that these numbers correspond to a certain percentage PS,M of symptomatic patients. This is justified by the fact that the majority of reported infected cases develop symptoms (about 85% according to the RKI [43]), but there is also a large number of asymptomatic cases (approximately 55–85% of infections [37,38,39]). In the present model, we assume PS,M to be constant. The exact equation linking states of the SECIR model with the measured numbers of infected subjects IS,M can be found in Appendix B and Appendix C, Equation (A7). We further accounted for delays in the reporting of case numbers by introducing a log-normally distributed delay time as explained in Appendix C.
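The centered seven-day smoothing can be sketched as follows. The handling of the first and last three days (a shrinking window) is an assumption of this sketch, and the weekly pattern is invented example data.

```python
def centered_moving_average(series, window=7):
    """Average over a window centered on each day; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed

# Invented case counts with a strong weekly pattern (low weekend reporting).
week = [10, 12, 11, 13, 12, 4, 3]
reported = week * 4
smoothed = centered_moving_average(reported)
```

For a series that repeats exactly every seven days, every full window contains each weekday once, so the smoothed interior values all equal the weekly mean and the periodicity disappears.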
Critical cases: The number of critical COVID-19 cases (DIVI reported ICU) is available since end of March 2020 [44]. We assumed that these data are complete since 16 April 2020 when reporting became mandatory by law in Germany. Earlier data were up-scaled from the number of reporting hospitals to the number of ICU-beds of all hospitals according to the reported ICU capacity available for 2018. We coupled the sum of the critical sub-compartments Ci (i = 1,2,3) to these numbers directly.
Deaths: Deaths are reported by the RKI, but daily reports do not reflect true death dates, which needs to be accounted for. Available daily death data of the RKI are retrospectively updated with a delay between true death date and reported date (death report delay, DRD). We assume that the DRD is normally distributed with a mean of 7.14 days and a standard deviation of 4 days, as reported by Delgado et al. [45]. Details can be found in Appendix C.
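One way to apply such a delay is to discretize the normal distribution into daily weights and convolve the series of true death dates with them. The sketch below uses the mean of 7.14 days and the standard deviation of 4 days quoted above; the truncation to delays of 0 to 30 days, the renormalization, and the function names are assumptions of this sketch, and the exact data model is the one given in Appendix C.

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def delay_kernel(mu=7.14, sigma=4.0, max_delay=30):
    """Daily delay weights from a normal distribution, truncated to 0..max_delay
    days and renormalized so that no deaths are lost."""
    raw = [normal_cdf(d + 0.5, mu, sigma) - normal_cdf(d - 0.5, mu, sigma)
           for d in range(max_delay + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def reported_deaths(true_deaths, kernel):
    """Shift each day's true deaths forward in time according to the delay weights."""
    out = [0.0] * len(true_deaths)
    for t, n in enumerate(true_deaths):
        for d, w in enumerate(kernel):
            if t + d < len(out):
                out[t + d] += n * w
    return out

kernel = delay_kernel()
true_series = [100.0] + [0.0] * 59  # invented example: 100 deaths on day 0
reported = reported_deaths(true_series, kernel)
peak_day = reported.index(max(reported))
```

A single day of deaths is thus spread over the following weeks in the reported series, with most reports arriving about one week later.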
Occurrence of B.1.1.7 variant: In January 2021, the variant B.1.1.7 became endemic in Germany and quickly replaced all other variants. Onset of this development was modeled by an instantaneous influx of 5% of newly infected subjects into the EMu compartment on 26 January 2021 estimated from published data [46].

2.4. Parametrization

We carefully searched the literature to establish ranges for our model parameters. These ranges are used as prior constraints during parametrization of our model (Table 2). Justification of prior values is provided in Appendix H. Parameters are then derived by fitting the predictions of the model to reported data of infected subjects, ICU occupation, and deaths using the link functions between model and data explained in the previous section. This is achieved via likelihood optimization. The likelihood is constructed using the same principles as reported in [47]. In short, the likelihood consists of three major parts, namely the likelihood of deviations from prior values, the likelihood of residual deviations from the data, and a penalty term to ensure that model parameters are within the prescribed ranges, as explained in Appendix D. We follow a full-information approach intended to use all data collected during the epidemic, as explained in Appendix E, Appendix F and Appendix G. As a result, our model fits well to the complete dynamics of the epidemic in Germany in the above-mentioned time period (Figure 2 and Figure 3).
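The three-part structure of such an objective can be sketched as a negative log-likelihood. This is a schematic illustration only: the Gaussian form of the prior and residual terms, the quadratic penalty, the penalty weight, and all names and numbers are assumptions, and the actual construction is the one given in Appendix D.

```python
def neg_log_likelihood(params, priors, bounds, observed, predicted,
                       sigma_resid, penalty_weight=1e3):
    """Sum of (1) deviations from prior values, (2) residual deviations from
    the data, and (3) a penalty for leaving the prescribed parameter ranges."""
    prior_term = sum(0.5 * ((params[k] - mean) / sd) ** 2
                     for k, (mean, sd) in priors.items())
    resid_term = sum(0.5 * ((o - p) / sigma_resid) ** 2
                     for o, p in zip(observed, predicted))
    penalty = 0.0
    for k, (lo, hi) in bounds.items():
        v = params[k]
        penalty += penalty_weight * (max(0.0, lo - v) ** 2 + max(0.0, v - hi) ** 2)
    return prior_term + resid_term + penalty

# Hypothetical single-parameter example.
priors = {"r_E": (0.3, 0.05)}   # prior mean and standard deviation
bounds = {"r_E": (0.1, 0.5)}    # prescribed range
observed = [10.0, 12.0]
at_prior = neg_log_likelihood({"r_E": 0.3}, priors, bounds, observed, observed, 1.0)
out_of_range = neg_log_likelihood({"r_E": 0.6}, priors, bounds, observed, observed, 1.0)
```

A parameter set that matches the prior means, reproduces the data exactly, and stays within its range yields the minimal value of zero; any deviation in one of the three parts increases the objective.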
To ensure identifiability of parameters, we checked a number of parsimony assumptions. For example, we assumed that the dynamical infection intensities of asymptomatic (b1·r1t) and symptomatic subjects (b2·r2t) are proportional with factor rb1,2. We also determined Bayesian Information Criteria (BIC) for different partitioning numbers of the external jump functions (Ncrit and Ndeath) to keep these as small as possible. Details can be found in Appendix B.
Likelihood optimization is achieved using the stochastic approximation expectation-maximization (SAEM) algorithm [48]. The algorithm is based on a stochastic integration of marginal probabilities without using likelihood approximations such as linearization, quadrature approximation, or sigma-point filtering [17].
Confidence intervals of model predictions are derived by Markov-Chain-Monte-Carlo simulations, i.e., alternative parameter settings are sampled from the parameter space around the optimal solution (Appendix B, Appendix F and Appendix G). We use these parameter sets to simulate alternative epidemic dynamics. This results in a distribution of model predictions from which empirical confidence intervals are derived.
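The trade-off captured by the BIC can be made concrete with a small sketch. The formula is the standard one (number of parameters times the logarithm of the number of observations, minus twice the maximized log-likelihood); the candidate step counts and log-likelihood values below are invented, and only the number of 1170 data points is taken from the Results section.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; smaller values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

N_OBS = 1170  # number of data points reported in the Results section
# Invented maximized log-likelihoods for different numbers of steps.
candidates = {10: -520.0, 18: -480.0, 25: -478.0}
scores = {n_steps: bic(ll, n_steps, N_OBS) for n_steps, ll in candidates.items()}
best_n_steps = min(scores, key=scores.get)
```

In this invented example, going from 18 to 25 steps improves the fit only marginally, so the additional parameters are not justified and the intermediate number of steps is selected.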
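Deriving empirical intervals from an ensemble of simulated trajectories amounts to taking quantiles per time point. In the sketch below, the ensemble is generated from a toy exponential growth model with randomly drawn growth rates, which stands in for simulations with sampled parameter sets; the quantile rule (nearest rank, rounded down) and all numbers are assumptions.

```python
import math
import random

def empirical_interval(trajectories, lower=0.025, upper=0.975):
    """Per-day empirical interval over an ensemble of simulated trajectories."""
    bands = []
    for values in zip(*trajectories):  # all simulated values of one day
        ordered = sorted(values)
        n = len(ordered)
        bands.append((ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]))
    return bands

rng = random.Random(0)
DAYS = 30
# Toy stand-in for sampled parameter sets: 500 hypothetical growth rates.
growth_rates = [rng.gauss(0.03, 0.01) for _ in range(500)]
ensemble = [[100.0 * math.exp(g * t) for t in range(DAYS)] for g in growth_rates]
bands = empirical_interval(ensemble)
```

The interval is degenerate on day 0, where all trajectories share the same starting value, and widens with the prediction horizon as the sampled parameter sets diverge.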

2.5. Implementation

The model and the respective parameter estimations are implemented in the statistical software package R, from which external publicly available functions are called. The model's equation solver is implemented as a C++ routine and called from R code using the Rcpp package. The code and data for simulation of the output layers using the reported parameter settings will be made available via our Leipzig Health Atlas (https://www.health-atlas.de/models/40, accessed on 26 June 2022) and GitHub (https://github.com/GenStatLeipzig/LeipzigIMISE-SECIR, accessed on 26 June 2022) [49].

3. Results

3.1. Explanation of Epidemiologic Dynamics

We used the full data set to explain the course of infections, ICU occupations, and deaths between 4 March 2020 and 29 March 2021 in Germany. A total of three parameters were assumed to be time-dependent, namely the infectivity b1 and the probabilities of a critical disease course (pcrit) and death (pdeath). We identified nine fixed and 19 empirically identified time points of NPI/behavioral changes (Table A1 from Appendix I). Regarding pcrit and pdeath, we identified 18 and 19 time steps, respectively (see Table A2 and Table A3 from Appendix I and Table A7 from Appendix J). Throughout the epidemic, we observed a good agreement of our model with incident (Figure 2) and cumulative data (Figure 3). Corresponding residual errors are provided for all observables (Table A4 from Appendix I). As shown in Table A2 from Appendix I, we estimated 14 static and three dynamically changing parameters using 1170 data points (390 daily measurements of registered cases, registered deaths and ICU occupancy) for Saxony as well as for Germany.
Figure 2. Agreement of model and incident data. We show incident infections, deaths, and daily ICU occupancy during the course of the epidemic in Germany between 4 March 2020 and 29 March 2021. Comparison of IO-NLDS model (magenta curve) and data (thin grey curves = raw data, solid black curve = data averaged by sliding window) is provided in the upper row. A good agreement is observed (shaded area = prediction uncertainty, vertical lines = changes in NPI/contact behavior). The middle row represents the corresponding input layer, i.e., the estimated time course of the time-dependent input parameters, namely infectivity and probabilities of critical disease course and death. Time steps correspond to the lines of changing NPI/contact behavior as displayed in the upper row. In the lower row, we present percentages of B.1.1.7 among infected subjects (first figure), subjects older than 80 years among infected corresponding to high death tolls (second), and subjects in the age categories 35–59 and 60–79, respectively, among critical cases (last figure of last row).
Figure 3. Agreement of model and cumulative data. We show cumulative infections and deaths during the course of the epidemic in Germany in between 4 March 2020 and 29 March 2021. Comparison of IO-NLDS model (magenta curve) and data (solid black curve) is provided. A good agreement is observed (shaded area = prediction uncertainty, vertical lines = changes in NPIs/contact behavior).

3.2. Parameter Estimates and Identifiability

Parameter estimates of the SECIR model are presented in Table 2, while those required to define the input layer are presented in Table 3 and Tables A1 and A3 from Appendix I. For those parameters for which we used prior information for fitting purposes, we compared the respective expected posteriors with their best priors (see Figure 4). Statistics are provided in Table A5 from Appendix I. No significant deviations between expected values of posteriors and priors were detected. All relative errors of parameters of the SECIR model are smaller than 10% demonstrating excellent identifiability of all epidemiologic parameters. As expected, identifiability of the external control functions is reduced. Largest standard errors of steps are in the order of 70% still demonstrating reasonable identifiability (Table A3 from Appendix I).

3.3. Plausibilization of Estimated Step Functions of Infectivity

We estimated the infectivity as an empirical step function through the course of the epidemic. This step function should also roughly reflect NPI effectivity. We therefore compared our infectivity step function with the Governmental stringency index of NPI as estimated on the basis of Hale et al. [50]. Results are displayed in Figure 5 and reveal a reasonable agreement.

3.4. Model Predictions

We regularly used our model to make predictions regarding the future course of the epidemic. Predictions were specifically made for the Free State of Saxony, a federal state of Germany and were published at the Leipzig Health Atlas [49]. We here present comparisons of our predictions with the actual course of the epidemic for two scenarios to demonstrate utility of our approach. Parameter values for Saxony were obtained in the same way as for Germany restricting available data of infected subjects, ICU cases, and deaths to this state. Estimated parameter values are presented in Table A6, Table A7 and Table A8 from Appendix J.
While Saxony was almost spared from the first wave of SARS-CoV-2 in Germany, the second wave hit the state particularly hard, resulting in the highest relative death toll of all German states (1 out of 400 inhabitants of Saxony died from COVID-19 during the second and the immediately following B.1.1.7-driven third wave). The second wave was at its peak in the middle of December 2020. A hard lock-down was initiated at this time, including closure of schools, prohibition of all team-based leisure activities, and a night-time curfew. We were asked by the government to estimate the length of lock-down required to break the second wave. Stringency of lock-down was comparable to the first wave. Thus, we simulated four scenarios: an optimistic assumption of a lock-down efficacy of 60% reduction in infectivity, a more realistic scenario with 40% reduction, a pessimistic assumption of only 20% lock-down efficacy, and finally, 0% reduction (no lock-down) as control scenario. Results are shown in Figure 6 and reveal a good agreement of our prediction with the actual course for the 40% scenario considered likely.
At the beginning of February 2021, the second wave was broken in Saxony and first relaxations of NPIs were conducted. At this time, the more virulent B.1.1.7 variant became endemic in Germany. On 14 February, the true percentage of the B.1.1.7 variant was unknown due to lack of sequencing capacities. Moreover, there were uncertainties with respect to the increase in infectivity by the B.1.1.7 variant. We therefore simulated three scenarios (optimistic: 10% initial proportion of B.1.1.7, infectivity increased by factor 1.7; expected: 20% initial proportion, 1.8-times increase in infectivity; pessimistic: 30% initial proportion, 2-times increase in infectivity). Results are shown in Figure 7. The actual course of the epidemic was close to the pessimistic scenario, i.e., the second wave was directly followed by a third wave due to the B.1.1.7 variant. Indeed, later data revealed that the proportion of B.1.1.7 was already close to 30% (pessimistic assumption) at the time the simulation was performed. Moreover, our model correctly predicted the variant replacement by B.1.1.7.
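The logic of such a scenario comparison can be sketched with a toy model in which only the reduction in infectivity differs between runs. The sketch uses a plain SIR difference equation rather than the SECIR model of this paper; the transmission and recovery rates, the initial number of infected subjects, and the rounded population size of about four million are illustrative assumptions.

```python
def simulate_incidence(reduction, days=60, beta=0.25, gamma=0.1,
                       population=4_000_000, infected_start=40_000):
    """Daily new infections of a toy SIR model with infectivity scaled by (1 - reduction)."""
    s = float(population - infected_start)
    i = float(infected_start)
    incidence = []
    for _ in range(days):
        new_inf = (1.0 - reduction) * beta * i * s / population
        s -= new_inf
        i += new_inf - gamma * i
        incidence.append(new_inf)
    return incidence

# The four lock-down efficacies considered in the text.
scenarios = {"optimistic": 0.6, "realistic": 0.4, "pessimistic": 0.2, "control": 0.0}
total_infections = {name: sum(simulate_incidence(r)) for name, r in scenarios.items()}
```

Comparing the resulting curves with subsequently observed data then indicates which assumed efficacy was closest to the real effect of the lock-down.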
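Why both the initial proportion and the infectivity factor matter for the speed of replacement can be seen in a simple calculation. The sketch assumes that both variants share the same generation time (set to a hypothetical five days) and the same susceptible population, so that the odds of the new variant are multiplied by the infectivity factor once per generation. This is a simplification and not the two-variant SECIR model used for Figure 7.

```python
def variant_share(initial_share, infectivity_factor, day, generation_days=5.0):
    """Share of the new variant among infections after `day` days."""
    odds = (initial_share / (1.0 - initial_share)
            * infectivity_factor ** (day / generation_days))
    return odds / (1.0 + odds)

# Initial proportion and infectivity factor of the three scenarios in the text.
scenarios = {"optimistic": (0.10, 1.7), "expected": (0.20, 1.8), "pessimistic": (0.30, 2.0)}
share_day_30 = {name: variant_share(p0, factor, 30)
                for name, (p0, factor) in scenarios.items()}
```

Under these assumptions the new variant dominates within a few weeks in all three scenarios, and the scenarios differ mainly in how quickly this happens.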

4. Discussion

In this paper, we proposed a method of parametrization of COVID-19 epidemiologic models and applied it to an extended SECIR-type model to explain the course of the epidemic in Germany and one of its federal states, the Free State of Saxony. Moreover, we demonstrated how the model can be used to make relevant predictions, which could be validated on the basis of subsequent observational data.
A key idea of our approach is the embedding of differential equations-based epidemic modelling into an input-output non-linear dynamical system (IO-NLDS). This has two major advantages. First, the approach allows combining explicit mechanistic models of epidemic spread and phenomenological considerations of external impacts on model parameters via the input layer. This allows parametrizing models of different complexity. For example, in our model we considered the effect of age structures of the diseased population only implicitly, through time-dependent input parameters such as probabilities of critical disease courses and deaths. This could easily be replaced, for example, by age-structured models. We believe that such a combined empirical/mechanistic approach is well suited to address the complexity of COVID-19 epidemic dynamics, for which it is impossible to consider all relevant mechanisms explicitly and in parallel.
The second major advantage of our approach is that we assumed an indirect link between state parameters of the embedded SECIR model and observables. This allows interposing a data model that considers known biases of the available data resources. We aimed at identifying relevant bias sources as far as possible and considered them in our proposed data models. However, these data models could be subject to changes in the future, for example if better data on COVID-19-related deaths are released. Improved data models could be easily integrated into our framework.
Note that the IO-NLDS implementation translates the embedded differential equations model to a discrete scale (i.e., days in our case), which however appears to be sufficient for describing an epidemic.
We also want to note that the SECIR model used here is neither unique nor the most comprehensive one. For example, the Robert-Koch-Institute developed a model for the purpose of estimating the effect of different vaccination strategies, which could easily be included into our SECIR-type models [54]. Although integration of differential equations-based models into our IO-NLDS context is more straightforward, our approach is also applicable to agent-based models. In general, parameter estimation for such models is underdeveloped in view of the highly biased data resources used, and to our knowledge, no generic concept has been proposed so far.
Based on our IO-NLDS formulation and data models, we parametrized our model on the basis of data of infection numbers, critical cases, and deaths available for Germany and Saxony. Here, we chose a full-information approach considering all data between the start of the epidemic on 4 March 2020 and 29 March 2021. We also applied a Bayesian learning process by considering other studies to inform the settings of model parameters. Thus, we combine mechanistic model assumptions with results from other studies and observational data. This approach is very popular in pharmacology [55], but despite its importance it is still rarely applied in epidemiology [11]. It resulted in a complex likelihood function, which is optimized on the basis of Markov-Chain Monte Carlo (MCMC) algorithms, as described in Appendix B and Appendix F. If the likelihood has a unique maximum, most of the samples eventually accumulate in its vicinity after a certain number of “burn-in” steps. This allows an effective MCMC search for the best parameter estimates as well as approximations of their standard errors (standard deviations of the sample) and the degree of overfitting. However, if parameters are interdependent, the MCMC algorithm samples manifolds of alternative solutions, resulting in large standard errors of the overfitted parameters. We successfully addressed this issue by a modified version of Maire’s algorithm [56]. Central to this approach is the idea that the proposal distribution adapts to the target by locally adding a mixture component when the discrepancy between the proposal mixture and the target is deemed to be too large. In other words, this algorithm samples multidimensional parameter sets, approximating their distribution as a mixture of multivariate Gaussian distributions. Such approaches enable adequate sampling of model parameters and detection of overfitting as well as of multiple local maxima of the likelihood.
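The role of the burn-in phase and the use of the sample standard deviation as a standard error can be illustrated with a plain random-walk Metropolis sampler. This is explicitly not the modified version of Maire's adaptive algorithm used in this work: it has a fixed Gaussian proposal, a one-dimensional toy target (a standard normal log-density), and arbitrary settings for step size, chain length, and burn-in.

```python
import math
import random
import statistics

def metropolis(log_target, start, step_sd, n_samples, rng):
    """Plain random-walk Metropolis sampler with a fixed Gaussian proposal."""
    x, log_p = start, log_target(start)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step_sd)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        samples.append(x)
    return samples

def log_standard_normal(x):
    return -0.5 * x * x  # toy target with a unique maximum at 0

rng = random.Random(1)
chain = metropolis(log_standard_normal, start=5.0, step_sd=1.0, n_samples=20_000, rng=rng)
BURN_IN = 2_000
kept = chain[BURN_IN:]
estimate = statistics.mean(kept)         # best estimate of the parameter
standard_error = statistics.stdev(kept)  # spread of the sample
```

The chain starts far from the maximum, moves into its vicinity during burn-in, and afterwards the retained samples approximate the target distribution. With strongly interdependent parameters, a fixed proposal like this one mixes poorly, which is the situation the adaptive mixture proposal is meant to handle.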
Our results revealed small standard errors indicating lack of overfitting, see Table A1 and Table A3 from Appendix I, Table A6 and Table A7 from Appendix J. We also applied rigorous information criteria to limit the number of steps of our input functions. As a consequence, it was possible to identify both, the fixed parameters of the SECIR model and the time-variable input functions representing changing NPI/contact behavior and age-structures.
Model parametrization resulted in a good and unbiased fit of data for the period considered for Germany. Fixed parameter values of the SECIR model did not significantly deviated from their prior values if available. It required 18 respectively 19 steps of changes of the probabilities to develop critical stage and to die respectively. A total of 13 intensification and 15 relaxation events were necessary to describe the epidemic dynamics over the time course of observations. Estimated infectivity roughly correlated with the Governmental Stringency Index [51]. We regularly contributed forecasts of our model to the German forecast Hub [57].
We also demonstrated utility of our model by several mid-term simulations of scenarios of epidemic development in Saxony, a federal state of Germany. We could show that predictions of reported infections were in the range of later observations for scenarios considered likely.
As future extensions and improvements of our model, we will consider stochastic effects on a daily scale, for example to model random influxes of cases or random extinctions of infection chains. These effects are relevant in times of low incidence numbers such as those observed in Germany in the summers of 2020 and 2021. Our IO-NLDS framework is well suited to implement such extensions [17].
In future versions of our model, we will also include age-structures and implement a vaccination and waning model in analogy to other research groups. In the current version of the model, we assumed a constant proportion of symptomatic patients reported as infected. This does not consider for example changing testing policies (i.e., symptomatic vs. prophylactic testing). We plan to refine our model in this regard in the future. Finally, we will consider the Delta and Omicron variants emerging in 2021 [53] in the next update of our SECIR model.
In summary, the primary focus of the paper is an adequate parametrization of epidemiological models on the basis of complex, possibly biased data, as well as its coupling with structurally unknown dynamical external influences. This approach allows for a clear separation of mechanistic model compartments from random or time-dependent non-mechanistic influences and biases in the data. We believe that this approach is useful not only for the parametrization of the SECIR model presented here but also for other epidemiologic models including other disease contexts and data structures.

Author Contributions

Conceptualization (ideas; formulation or evolution of overarching research goals and aims): H.K. and M.S.; data curation (management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later reuse): H.K.; formal analysis (application of statistical, mathematical, computational, or other formal techniques to analyze or synthesize study data): Y.K.; funding acquisition (acquisition of the financial support for the project leading to this publication): M.S.; investigation (conducting a research and investigation process, specifically performing the experiments, or data/evidence collection): Y.K., H.K. and M.S.; methodology (development or design of methodology; creation of models): Y.K., M.S. and H.K.; project administration (management and coordination responsibility for the research activity planning and execution): M.S.; resources (provision of study materials, reagents, materials, patients, laboratory samples, animals, instrumentation, computing resources, or other analysis tools): H.K.; software (programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components): Y.K. and H.K.; supervision (oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team): M.S.; validation (verification, whether as a part of the activity or separate, of the overall replication/reproducibility of results/experiments and other research outputs): H.K. and M.S.; visualization (preparation, creation and/or presentation of the published work, specifically visualization/data presentation): H.K.; writing, original draft preparation (creation and/or presentation of the published work, specifically writing the initial draft (including substantive translation)): Y.K., H.K. and M.S.; writing, review and editing (preparation, creation and/or presentation of the published work by those from the original research group, specifically critical review, commentary or revision, including pre- or post-publication stages): Y.K., H.K. and M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This project was funded in the framework of the project SaxoCOV (Saxonian COVID-19 Research Consortium). SaxoCOV was financed by the Free State of Saxony. Presentation of data, model results and simulations were funded by the NFDI4Health Task Force COVID-19 (www.nfdi4health.de/task-force-covid-19-2, accessed on 20 June 2022) within the framework of a DFG-project (LO-342/17-1). Epidemiological modeling was also supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med line of funding (CAPSyS, grant number 01ZX1304A) and the project PROGNOSIS (grant number #031L0296A).

Institutional Review Board Statement

Ethical review and approval were waived for this study as only published data from official sources was used (see manuscript for detailed description of data sources).

Informed Consent Statement

Patient consent was waived for this study as only published data from official sources was used (see manuscript for detailed description of data sources).

Data Availability Statement

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Equations of the SECIR Model 

We here present the equations of the SECIR model serving as the hidden layer of our input-output non-linear dynamical system (IO-NLDS). To fit in this context, the ordinary differential equations of the SECIR model are approximated by a difference equation system describing compartment changes on single days, i.e., the time step $\Delta t$ equals one day. Compartments of the model are explained in Table 1 of the main paper. Parameters are explained in Table 2 of the main paper.
$$
\begin{aligned}
\frac{\Delta Sc}{\Delta t} &= -\mathit{influx} - \mathit{Influx}_E - \mathit{Influx}_E^{Mu} \\
\frac{\Delta E}{\Delta t} &= \mathit{influx} + \mathit{Influx}_E - r_3 \cdot E \\
\frac{\Delta I_{A,1}}{\Delta t} &= r_3 \cdot E - \left( X(r_{4,b}, r_4, p_{symp}) \cdot r_{4,b} + \left(1 - X(r_{4,b}, r_4, p_{symp})\right) \cdot r_4 \right) \cdot I_{A,1} \\
\frac{\Delta I_{A,2}}{\Delta t} &= \left(1 - X(r_{4,b}, r_4, p_{symp})\right) \cdot r_4 \cdot I_{A,1} - r_4 \cdot I_{A,2} \\
\frac{\Delta I_{A,3}}{\Delta t} &= r_4 \cdot I_{A,2} - r_4 \cdot I_{A,3} \\
\frac{\Delta I_{S,1}}{\Delta t} &= X(r_{4,b}, r_4, p_{symp}) \cdot r_{4,b} \cdot I_{A,1} - \left( X(r_6, r_5, p_{crit}) \cdot r_6 + \left(1 - X(r_6, r_5, p_{crit})\right) \cdot r_5 \right) \cdot I_{S,1} \\
\frac{\Delta I_{S,2}}{\Delta t} &= \left(1 - X(r_6, r_5, p_{crit})\right) \cdot r_5 \cdot I_{S,1} - r_5 \cdot \left(1 - p_{death,S}\right) \cdot I_{S,2} - p_{death,S} \cdot r_8 \cdot I_{S,2} \\
\frac{\Delta I_{S,3}}{\Delta t} &= \left(1 - p_{death,S}\right) \cdot r_5 \cdot I_{S,2} - r_5 \cdot I_{S,3} \\
\frac{\Delta C_1}{\Delta t} &= X(r_6, r_5, p_{crit}) \cdot r_6 \cdot \left(I_{S,1} + I_{S,1}^{Mu}\right) - \left( X(r_8, r_7, p_{death}) \cdot r_8 + \left(1 - X(r_8, r_7, p_{death})\right) \cdot r_7 \right) \cdot C_1 \\
\frac{\Delta C_2}{\Delta t} &= \left(1 - X(r_8, r_7, p_{death})\right) \cdot r_7 \cdot C_1 - r_7 \cdot C_2 \\
\frac{\Delta C_3}{\Delta t} &= r_7 \cdot C_2 - r_7 \cdot C_3 \\
\frac{\Delta R}{\Delta t} &= r_5 \cdot \left(I_{S,3} + I_{S,3}^{Mu}\right) + r_7 \cdot C_3 + r_4 \cdot \left(I_{A,3} + I_{A,3}^{Mu}\right) \\
\frac{\Delta D}{\Delta t} &= X(r_8, r_7, p_{death}) \cdot r_8 \cdot C_1 + p_{death,S} \cdot r_8 \cdot \left(I_{S,2} + I_{S,2}^{Mu}\right)
\end{aligned}
$$
With the abbreviations
$$
\begin{aligned}
\mathit{Influx}_E &= r_1 \cdot b_1 \cdot Sc \cdot \left(I_{A,1} + I_{A,2} + I_{A,3}\right) + r_2 \cdot b_2 \cdot Sc \cdot \left(I_{S,1} + I_{S,2} + I_{S,3}\right) \\
\mathit{Influx}_E^{Mu} &= Mu_r \cdot \left( r_1 \cdot b_1 \cdot Sc \cdot \left(I_{A,1}^{Mu} + I_{A,2}^{Mu} + I_{A,3}^{Mu}\right) + r_2 \cdot b_2 \cdot Sc \cdot \left(I_{S,1}^{Mu} + I_{S,2}^{Mu} + I_{S,3}^{Mu}\right) \right)
\end{aligned}
$$
Here, we assume that asymptomatic and symptomatic compartments have different time-dependent infectivities ($r_1 \cdot b_1$ and $r_2 \cdot b_2$), with $b_1$ and $b_2$ later defined on the basis of time-dependent NPI/behavioral changes and, for parsimony, $b_1 = b_2$. Hence, the ratio of the products $r_1 \cdot b_1$ and $r_2 \cdot b_2$ is assumed constant. The factor $Mu_r$ describes the increased infectivity of the new virus variant: it is multiplied with $r_1$ and $r_2$, reflecting the higher infectivity of the B.1.1.7 variant compared to the previous variants. The superscript $Mu$ denotes new virus variants.
The functions X represent decisions regarding the further disease course defined below.
Analogously, the equations for the concurrent variant compartments are as follows:
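$$
\begin{aligned}
\frac{\Delta E^{Mu}}{\Delta t} &= \mathit{Influx}_E^{Mu} - r_3 \cdot E^{Mu} \\
\frac{\Delta I_{A,1}^{Mu}}{\Delta t} &= r_3 \cdot E^{Mu} - \left( X(r_{4,b}, r_4, p_{symp}) \cdot r_{4,b} + \left(1 - X(r_{4,b}, r_4, p_{symp})\right) \cdot r_4 \right) \cdot I_{A,1}^{Mu} \\
\frac{\Delta I_{A,2}^{Mu}}{\Delta t} &= \left(1 - X(r_{4,b}, r_4, p_{symp})\right) \cdot r_4 \cdot I_{A,1}^{Mu} - r_4 \cdot I_{A,2}^{Mu} \\
\frac{\Delta I_{A,3}^{Mu}}{\Delta t} &= r_4 \cdot I_{A,2}^{Mu} - r_4 \cdot I_{A,3}^{Mu} \\
\frac{\Delta I_{S,1}^{Mu}}{\Delta t} &= X(r_{4,b}, r_4, p_{symp}) \cdot r_{4,b} \cdot I_{A,1}^{Mu} - \left( X(r_6, r_5, p_{crit}) \cdot r_6 + \left(1 - X(r_6, r_5, p_{crit})\right) \cdot r_5 \right) \cdot I_{S,1}^{Mu} \\
\frac{\Delta I_{S,2}^{Mu}}{\Delta t} &= \left(1 - X(r_6, r_5, p_{crit})\right) \cdot r_5 \cdot I_{S,1}^{Mu} - r_5 \cdot \left(1 - p_{death,S}\right) \cdot I_{S,2}^{Mu} - p_{death,S} \cdot r_8 \cdot I_{S,2}^{Mu} \\
\frac{\Delta I_{S,3}^{Mu}}{\Delta t} &= \left(1 - p_{death,S}\right) \cdot r_5 \cdot I_{S,2}^{Mu} - r_5 \cdot I_{S,3}^{Mu}
\end{aligned}
$$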
In some of the compartments, there is a decision between recovery or deterioration of the disease course. For example, patients in C1 can either die or recover with different rates ri and rj. To model this decision, we split the number of patients in the compartment by a respective probability p and determine the decision factor X as follows.
$$
\frac{r_i \cdot X}{r_j \cdot (1 - X)} = \frac{p}{1 - p}
$$
Thus
$$
X(r_i, r_j, p) = \frac{r_j \cdot p}{r_i \cdot (1 - p) + r_j \cdot p}
$$
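The decision factor can be checked numerically. The following minimal Python sketch evaluates $X$ and confirms that the two effluxes of a compartment split in the ratio $p : (1 - p)$. The function name and the rate values are illustrative assumptions and are not taken from the authors' implementation.

```python
def decision_factor(r_i: float, r_j: float, p: float) -> float:
    """X(r_i, r_j, p) = r_j*p / (r_i*(1-p) + r_j*p)."""
    return r_j * p / (r_i * (1.0 - p) + r_j * p)


# Made-up example values: r_i is the rate of the outcome that occurs with
# probability p, r_j is the rate of the alternative outcome.
r_i, r_j, p = 0.1, 0.2, 0.3
X = decision_factor(r_i, r_j, p)

# The ratio of the two effluxes should equal p / (1 - p).
flux_ratio = (r_i * X) / (r_j * (1.0 - X))
print(X, flux_ratio, p / (1.0 - p))
```

For equal rates ($r_i = r_j$) the factor reduces to $X = p$, i.e., the split is governed by the probability alone.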
To start the epidemic in Germany, we assumed an entry of infected cases described by a linearly decreasing function starting on 4 March 2020 and reaching zero on 10 March 2020. Occurrence of the B.1.1.7 variant was initialized by a single influx to $E^{Mu}$, $I_{A,1}^{Mu}$, $I_{A,2}^{Mu}$, $I_{A,3}^{Mu}$, $I_{S,1}^{Mu}$, $I_{S,2}^{Mu}$, $I_{S,3}^{Mu}$ on 26 February 2021. We assume that the ratios between these influxes are the same as those between the normal variant compartments on this day. The sum of $E^{Mu}$, $I_{A,1}^{Mu}$, $I_{A,2}^{Mu}$, $I_{A,3}^{Mu}$, $I_{S,1}^{Mu}$, $I_{S,2}^{Mu}$, $I_{S,3}^{Mu}$ on 26 February 2021 was fitted to be 5.3% of the corresponding sum of the normal compartments on the same day.

Appendix B. Input Layer 

The input layer of our IO-NLDS is designed to describe the effects of non-pharmaceutical interventions (NPI) and other impacts on infectivity such as behavioral changes, changes in age-structure, testing policy, seasonal effects, or larger outbreaks (abbreviated as NPI/contact behaviour). Since these effects typically affect contact matrices in different ways, we model this phenomenologically by time-dependent reductions or increases of infection rates caused by symptomatic and asymptomatic subjects as explained in Equation (A1).
We make the following assumptions for non-pharmaceutical interventions:
1. We introduce the relative infectivity function $b(t)$, which changes according to NPI/contact behavior modifications. This is modelled by a linear increase (in case of relaxation) or decrease (in case of tightening) within a fixed time $Del_{tr}$ of two days. Otherwise, $b(t)$ is constant. We denote by $\{T_{tr,s}\}_{s=1}^{N_{tr}}$ the time points with changes in non-pharmaceutical interventions, with $N_{tr}$ the total number of time points with changes. We collected dates of changing non-pharmaceutical intervention measures for Germany based on government decisions, changing testing policies, and events with impact on epidemiological dynamics such as holidays and sudden outbreaks (such as the narrow peak of new infections in June, which mostly affected workers of the meat industry). We also assumed additional time points with changes determined by BIC.
Again, for the sake of parsimony, we assume that the relative infection intensities of asymptomatic ($b_1(t)$) and symptomatic subjects ($b_2(t)$) are the same; hence, the respective proportionality $r_{b1,2}$ is constant and estimated during model fitting.
Thus, b t is defined as follows:
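$$
b(t) = \begin{cases}
b_{tr,s-1}, & t \in \left[T_{tr,s-1} + Del_{tr},\; T_{tr,s}\right] \\
\dfrac{b_{tr,s-1}}{Del_{tr}} \cdot \left(Del_{tr} - t + T_{tr,s}\right) + \dfrac{b_{tr,s}}{Del_{tr}} \cdot \left(t - T_{tr,s}\right), & t \in \left[T_{tr,s},\; T_{tr,s} + Del_{tr}\right] \\
b_{tr,s}, & t \in \left[T_{tr,s} + Del_{tr},\; T_{tr,s+1}\right]
\end{cases}
$$
$$
b_{tr,0} = 1, \qquad b_1(t) = b(t), \qquad b_2(t) = b_1(t)
$$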
The time point t = 0 corresponds to the 3 March 2020.
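The piecewise definition of the relative infectivity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument layout and the example change points are hypothetical, time is measured in days since $t = 0$, and consecutive change points are assumed to be at least the transition time apart.

```python
from bisect import bisect_right


def relative_infectivity(t, change_times, levels, del_tr=2.0):
    """Piecewise-linear b(t).

    change_times: sorted change points T_{tr,1..N} (days since t = 0)
    levels:       target levels b_{tr,1..N}; the initial level b_{tr,0} is 1
    del_tr:       duration of the linear transition in days
    """
    s = bisect_right(change_times, t)  # number of change points with T <= t
    if s == 0:
        return 1.0  # before the first change: b_{tr,0} = 1
    previous = 1.0 if s == 1 else levels[s - 2]
    target = levels[s - 1]
    elapsed = t - change_times[s - 1]
    if elapsed >= del_tr:
        return target  # transition finished, constant level
    # Linear interpolation between the previous and the new level.
    return previous * (del_tr - elapsed) / del_tr + target * elapsed / del_tr


# Hypothetical example: tightening at day 10, partial relaxation at day 20.
times, levels = [10.0, 20.0], [0.5, 0.8]
print([round(relative_infectivity(t, times, levels), 3) for t in (5, 11, 15, 21, 30)])
```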
2. Likewise, rates toward critical disease states and deaths are also assumed to vary through the course of the epidemic due to changes in testing policy resulting in different percentages of unreported cases and asymptomatic subjects, changes in the age distribution of infected subjects, improvement of patient care due to new treatment options, and possibly over-stretched medical resources (not the case in Germany but in other countries). In our model, this is also accounted for phenomenologically by assuming the probabilities $p_{crit}$ and $p_{death}$ to be time-dependent input parameters.
We assume that both functions are step functions:
$$
p_{crit}(t) = p_{crit,0} \cdot \sum_{j=0}^{N_{crit}-1} \alpha_{crit,j} \cdot \chi_{\left[T_{pcrit,j},\, T_{pcrit,j+1}\right]}(t)
$$
where $\{T_{pcrit,j}\}_{j=1}^{N_{crit}}$ are empirical dates and $\{\alpha_{crit,j}\}_{j=1}^{N_{crit}}$ the respective relative changes of $p_{crit}$. Both $\{T_{pcrit,j}\}_{j=1}^{N_{crit}}$ and $\{\alpha_{crit,j}\}_{j=1}^{N_{crit}}$ are parameters to be estimated. The initial value of $p_{crit}$ is $p_{crit,0}$. The functions $\chi_{\left[t_j,\, t_{j+1}\right]}(t)$ are indicator functions, being 1 in the interval $\left[T_{pcrit,j},\, T_{pcrit,j+1}\right]$ and 0 otherwise.
The step functions for $p_{death}(t)$ and $p_{death,S}(t)$ are defined analogously:
$$
\begin{aligned}
p_{death}(t) &= p_{death,0} \cdot \sum_{j=0}^{N_{death}-1} \alpha_{death,j} \cdot \chi_{\left[T_{pdeath,j},\, T_{pdeath,j+1}\right]}(t) \\
p_{death,S}(t) &= p_{death,S,0} \cdot p_{death}(t)
\end{aligned}
$$
The partitions of $p_{crit}$ and $p_{death}$ are assumed independent. The respective numbers of jumps $N_{crit}$ and $N_{death}$ can differ.
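A step function of this kind can be evaluated with a simple interval lookup. The sketch below is illustrative only: the function name and example values are hypothetical, and it assumes half-open intervals $[T_j, T_{j+1})$ so that each time point belongs to exactly one step.

```python
from bisect import bisect_right


def step_probability(t, p0, break_times, alphas):
    """p(t) = p0 * alpha_j for T_j <= t < T_{j+1}, and 0 outside [T_0, T_N).

    break_times: sorted interval borders T_0 < T_1 < ... < T_N
    alphas:      relative changes alpha_0 .. alpha_{N-1}, one per interval
    """
    if len(break_times) != len(alphas) + 1:
        raise ValueError("need exactly one more break time than alpha values")
    j = bisect_right(break_times, t) - 1  # index of the interval containing t
    if j < 0 or j >= len(alphas):
        return 0.0  # all indicator functions vanish outside the partition
    return p0 * alphas[j]


# Hypothetical example: the probability halves after day 30.
print(step_probability(10, 0.1, [0, 30, 60], [1.0, 0.5]))
print(step_probability(45, 0.1, [0, 30, 60], [1.0, 0.5]))
```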
In order to find an optimal tradeoff between parsimony and goodness of fit, we calculated the Bayesian information criterion (BIC) for different partition numbers $N_{tr}$, $N_{crit}$, and $N_{death}$ and chose the partitions minimizing BIC.
When new data become available, we attempt to update the numbers of partitions every two weeks by considering the addition of a new break-point within the last month. The time point as well as the corresponding jump value are considered as two new parameters. We add a new break-point only if it improves BIC after re-estimation of the parameters.
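The acceptance rule for a new break-point can be illustrated as follows. The sketch uses the textbook definition $BIC = k \cdot \ln(n) + 2 \cdot nLL$ and assumes that $nLL$ is a proper negative log-likelihood; whether the authors' implementation uses exactly this scaling is not stated here, and all names and numbers are hypothetical.

```python
import math


def bic(nll: float, n_params: int, n_obs: int) -> float:
    """Textbook BIC for a model with negative log-likelihood nll."""
    return n_params * math.log(n_obs) + 2.0 * nll


def accept_breakpoint(nll_old, n_params_old, nll_new, n_obs):
    """A new break-point adds two parameters (time point and jump value)."""
    return bic(nll_new, n_params_old + 2, n_obs) < bic(nll_old, n_params_old, n_obs)


# Hypothetical refits with 400 data points and 50 parameters so far.
print(accept_breakpoint(1000.0, 50, 990.0, 400))  # large improvement of the fit
print(accept_breakpoint(1000.0, 50, 995.0, 400))  # improvement too small
```

Under this definition, the two additional parameters pay off only if the negative log-likelihood drops by more than $\ln(n)$.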

Appendix C. Output Layer 

We here describe how the state parameters of the hidden SECIR model are linked with data via the output layer of the IO-NLDS.
Modeling of daily registered infected cases $I_{S,M}$: The total number of daily registered infected cases $I_{S,M}$ is coupled to the efflux of the first asymptomatic compartments $I_{A,1}$ and $I_{A,1}^{Mu}$ toward symptomatic compartments multiplied with $P_{S,M}$.
$$
I_{S,M}(T) = \sum_{t=0}^{T} P_{S,M} \cdot X(r_{4,b}, r_4, p_{symp}) \cdot r_{4,b} \cdot \left(I_{A,1}(t) + I_{A,1}^{Mu}(t)\right)
$$
However, we assume a delay of the registered vs. reported cases by introducing an empirical distribution of the reporting delay: When a person has a positive PCR SARS-CoV-2 test result at a date d1, this will result in a registered case at a later date d2. The difference d = d2 − d1 is the reporting delay and is assumed log-normally distributed. This distribution of delays is determined on the basis of data provided by the Robert-Koch-Institute from the period 27 April 2020 to 13 November 2020. The parameters of this distribution are derived by minimizing the Kullback–Leibler divergence between the parametric representation and the empirical distribution. Results are displayed in Figure A1.
Figure A1. Approximation of reporting delay by a log-normal distribution: We present the log-normal distribution best fitting the empirical distribution of reporting delays. Estimated parameters of the log-normal distribution are as follows: μ = 1.77 days, σ = 0.531 days.
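To illustrate how the fitted log-normal distribution can be used on an integer time grid, the following sketch converts it into daily delay probabilities. The parameters are those reported in the caption of Figure A1; the discretization (delay $d$ covers the interval $[d, d+1)$), the truncation at 30 days and all function names are assumptions made for this example only.

```python
import math

MU, SIGMA = 1.77, 0.531  # log-scale parameters reported for Figure A1


def lognormal_cdf(x: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """Cumulative distribution function of the log-normal distribution."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def delay_pmf(max_delay: int = 30):
    """Probability of a reporting delay of d = 0..max_delay days (renormalized)."""
    raw = [lognormal_cdf(d + 1) - lognormal_cdf(d) for d in range(max_delay + 1)]
    total = sum(raw)
    return [w / total for w in raw]


pmf = delay_pmf()
print([round(w, 3) for w in pmf[:10]])
```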
Modeling delay of death reporting: In contrast to the newly infected cases, neither information of delays in COVID-19 associated death reporting nor actual dates of deaths were available to us. Therefore, we used a data model proposed by Delagdo et al. [45]. In detail, we assumed that the delay is normally distributed with an average of 7.14 days and a standard deviation of 4 days.
Since we consider time as an integer, we discretize this normal distribution by the approximation $DRD(d) = \dfrac{N_{7.14,4}(d)}{\sum_{i=1}^{100} N_{7.14,4}(i)}$ for integers $d \le 100$ and 0 otherwise, where $N_{7.14,4}$ is the Gaussian density with a mean of 7.14 days and a standard deviation of 4 days, i.e., we neglect delays larger than 100 days. Using this approximation, we derive the actual number of new deaths at time point $t$:
$$
D_a(t) = \sum_{t_1 < t} D_r(t) \cdot DRD(t - t_1)
$$
Here, $D_r(t)$ is the number of reported new deaths at time $t$. The function $D_a(t)$ is linked to our compartment $D$.
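The discretized delay kernel $DRD$ can be computed directly from the Gaussian density, because the normalization constant of the density cancels in the ratio. The sketch below covers only the kernel, not its application to the death counts; the function names are hypothetical, and the support $d = 1, \dots, 100$ is an assumption derived from the range of the normalizing sum.

```python
import math


def death_delay_kernel(mean: float = 7.14, sd: float = 4.0, max_delay: int = 100):
    """Normalized Gaussian weights on the integer delays 1..max_delay."""
    density = {
        d: math.exp(-0.5 * ((d - mean) / sd) ** 2) for d in range(1, max_delay + 1)
    }
    total = sum(density.values())
    return {d: w / total for d, w in density.items()}


def drd(d: int, kernel) -> float:
    """DRD(d); delays outside the tabulated support get weight 0."""
    return kernel.get(d, 0.0)


kernel = death_delay_kernel()
print(round(drd(7, kernel), 4), drd(101, kernel))
```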

Appendix D. Parameter Estimation 

Free parameters of the model are determined by minimizing the negative log-likelihood function of observed data. The negative log-likelihood is constructed in analogy to [47]. It consists of the sum of three components:
$$
nLL = nLL^{pri} + nLL^{resid} + Constr
$$
The terms $nLL^{pri}$ and $nLL^{resid}$ correspond to prior constraints of parameters and to the residual errors of the data, as explained below in detail. The term $Constr$ is a penalty term to keep values in eligible ranges or orders (see “Penalization”). We assume independence between parameters throughout.
Parameter distributions and transformations: Most of the parameters are confined to certain ranges. During estimation (with possible prior constraints), we transform these parameters to the space of real numbers. We assume that these transformed values are normally distributed during Markov chain Monte Carlo (MCMC) sampling (see below). To ensure this, parameters confined to a finite interval $(a, b)$ are transformed by the logit function. Parameters with positive values are transformed by the logarithm, i.e., they are assumed to be log-normally distributed. Thus,
$$
\varphi_s = h_s(\psi_s), \qquad h_s(\psi_s) = \begin{cases}
e^{\psi_s}, & \text{for parameters} > 0 \\
a + (b - a) \cdot \dfrac{e^{\psi_s}}{1 + e^{\psi_s}}, & \text{for parameters within } [a, b]
\end{cases}, \qquad s = 1, \dots, N_{par}
$$
where $\varphi_s$ is the $s$-th parameter, $\psi_s$ is the respective transformed parameter, and $N_{par}$ is the total number of parameters to be estimated.
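The two transformations and their inverses can be written compactly. The following sketch is a minimal illustration with hypothetical function names; it is not the authors' code and omits safeguards against numerical overflow for extreme values.

```python
import math


def to_natural(psi, lower=None, upper=None):
    """h_s: map an unconstrained value psi to the parameter scale."""
    if lower is None:
        return math.exp(psi)  # positive parameters
    return lower + (upper - lower) * math.exp(psi) / (1.0 + math.exp(psi))


def to_unconstrained(phi, lower=None, upper=None):
    """Inverse of h_s: logarithm for positive parameters, logit for bounded ones."""
    if lower is None:
        return math.log(phi)
    q = (phi - lower) / (upper - lower)
    return math.log(q / (1.0 - q))


# Round trips with made-up values: a positive rate and a probability in (0, 1).
print(to_natural(to_unconstrained(0.25)))
print(to_natural(to_unconstrained(0.3, 0.0, 1.0), 0.0, 1.0))
```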
The negative log-likelihood contribution of the priors $nLL^{pri}$ is defined as follows:
$$
nLL^{pri} = \sum_{s=1}^{N_{par}} \delta_s \cdot \frac{\left(\psi_s - \psi_s^{pri}\right)^2}{\omega_{pri,s}^2}
$$
where $\delta_s$ equals 1 if a prior is assumed for the $s$-th parameter and 0 otherwise. The prior information is represented by the “best value” $\psi_s^{pri}$ and an uncertainty expressed as the standard deviation of possible values $\omega_{pri,s}$. We assume that parameter estimates are random variables normally distributed around their respective prior values. Thus,
$$
\psi_s \sim N\left(\psi_s^{pri}, \omega_{pri,s}\right) = N\left(h_s^{-1}\left(\varphi_s^{pri}\right), \omega_{pri,s}\right)
$$
Prior “best values” and ranges of parameters are provided in Table 2 of the main paper. The uncertainties $\omega_{pri,s}$ are set to 2 for all parameters. This heuristic setting is based on a tradeoff between avoidance of overfitting, including implausible parameter values, and good data fitting properties.
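The prior contribution is a sum of scaled squared deviations on the transformed scale. A minimal sketch, with a hypothetical function name and made-up values, could look as follows; a missing prior (i.e., $\delta_s = 0$) is encoded as `None`.

```python
def prior_nll(psi, psi_prior, omega=2.0):
    """Sum of (psi_s - psi_s_pri)^2 / omega^2 over all parameters with a prior.

    psi_prior[s] is None when no prior is assumed for the s-th parameter.
    """
    total = 0.0
    for value, prior in zip(psi, psi_prior):
        if prior is not None:
            total += (value - prior) ** 2 / omega ** 2
    return total


# Three transformed parameters, the second one without a prior.
print(prior_nll([1.0, 5.0, 3.0], [0.0, None, 5.0]))
```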
Penalization: We penalize with a high value of $10^8$ in cases where times of non-pharmaceutical interventions are either too close (closer than 3 days) or non-monotonic. In the same way, we penalize excessively high dynamical $p_{death}$ values (more than 0.66) by multiplying $\max(p_{death} - 0.66, 0)$ by 100.
Residual errors of observed vs. predicted data: We fit data for daily registered cases, cumulative registered cases, deaths, cumulative deaths, and ICU occupation as explained in sub-section “output layer” and the methods section of the main paper. The respective term of the negative log-likelihood $nLL^{resid}$ corresponds to the residual errors of these data. Thus,
$$
nLL^{resid} = \sum_{Y_{out}} \left[ we_{dY_{out}} \cdot \sum_{j=1}^{N_{dY_{out}}} \frac{\left( dY_{out,S}^{\,tr_{dY_{out}}}\left(t_{j,dY_{out}}, \psi\right) - dY_{out,D}^{\,tr_{dY_{out}}}\left(t_{j,dY_{out}}\right) \right)^2}{a_{dY_{out}}^2} + we_{cumul} \cdot we_{Y_{out}} \cdot \sum_{j=1}^{N_{Y_{out}}} \frac{\left( Y_{out,S}^{\,tr_{Y_{out}}}\left(t_{j,Y_{out}}, \psi\right) - Y_{out,D}^{\,tr_{Y_{out}}}\left(t_{j,Y_{out}}\right) \right)^2}{a_{Y_{out}}^2} \right]
$$
where $Y_{out}$ represents the output layers (“dY” corresponds to daily counts, while “Y” corresponds to cumulative counts). Subscript $S$ denotes simulation results, while $D$ corresponds to the data. We sum the negative log-likelihoods of the three outputs considered (infected subjects, critical cases and deaths). Thus, $Y_{out}$ represents one of these three entities $x$ with number of data points $N_x$ at time points $t_{j,x}$ ($j = 1, \dots, N_x$) and residual errors $a_x$. We introduce a weight $we_{cumul}$ for the cumulative terms relative to the daily counts and set it to 0.2. The cumulative terms were introduced to avoid biases of cumulative data occurring after fitting daily data only. Cumulative data for ICU occupation were not fitted, i.e., $we_{ICU} = 0$. The parameter $tr_{Y_{out}}$ corresponds to the power transformation used to compare model and data. In the present model version, it is set to 0.5. It constitutes a tradeoff between fitting precision of large and small numbers. All other weights $we_{dY_{out}}$ and $we_{Y_{out}}$ were set to 1.
Thus, we assume that for each output and for each data point the entities $dY_{out,D}^{\,tr_{dY_{out}}}$ and $Y_{out,D}^{\,tr_{Y_{out}}}$ are normally distributed random variables around the respective simulated values, with standard deviations being the respective residual errors:
$$
\begin{aligned}
dY_{out,D}^{\,tr_{dY_{out}}}\left(t_{j,dY_{out}}\right) &\sim N\left( dY_{out,S}^{\,tr_{dY_{out}}}\left(t_{j,dY_{out}}, \psi\right),\; a_{dY_{out}} \right) \\
Y_{out,D}^{\,tr_{Y_{out}}}\left(t_{j,Y_{out}}\right) &\sim N\left( Y_{out,S}^{\,tr_{Y_{out}}}\left(t_{j,Y_{out}}, \psi\right),\; a_{Y_{out}} \right)
\end{aligned}
$$
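For a single output, the residual term combines a daily and a down-weighted cumulative part, both compared after the power transformation. The sketch below illustrates this for one output with unit weights for the daily part; the function names and the toy data are hypothetical, and the sketch is not the authors' implementation.

```python
from itertools import accumulate


def squared_error_term(simulated, observed, a, power=0.5):
    """Sum of squared differences of power-transformed values, scaled by a^2."""
    return sum((s ** power - d ** power) ** 2 for s, d in zip(simulated, observed)) / a ** 2


def residual_nll(sim_daily, obs_daily, a_daily, a_cumul, we_cumul=0.2, power=0.5):
    """Daily term plus weighted cumulative term for one output."""
    daily = squared_error_term(sim_daily, obs_daily, a_daily, power)
    cumulative = squared_error_term(
        list(accumulate(sim_daily)), list(accumulate(obs_daily)), a_cumul, power
    )
    return daily + we_cumul * cumulative


# Toy data: simulated vs. observed daily counts.
print(residual_nll([4, 9], [1, 4], a_daily=1.0, a_cumul=1.0))
```

The square-root transformation (power 0.5) reduces the dominance of large counts, which matches the stated tradeoff between fitting precision of large and small numbers.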
The algorithm to minimize the negative log-likelihood is explained in the next section. Differences between estimated values and their respective priors can be tested by calculating Z-scores $\dfrac{\psi_s - \psi_s^{pri}}{\omega_{pri,s}}$.

Appendix E. Algorithm for Parameter Estimations and Prediction Sampling 

Due to the nonlinearity of Equations (A8), (A11) and (A12), parameters $\psi$ and residual errors $\theta$ cannot be estimated simultaneously. For such situations, an expectation-maximization (EM) algorithm was proposed by Dempster et al. [58]. This algorithm is a widely applied approach for the iterative computation of maximum likelihood (correspondingly, minimum of negative log-likelihood) estimates in incomplete-data statistical problems. In detail, the random parameters $\psi = \{\psi_s\}_{s=1}^{N_{par}}$ are considered as non-observed data, while the observed data $y$ in our case are defined as follows:
$$
y = \left( \left\{ I_M^D\left(t_{j,I_M}\right) \right\}_{j=1}^{N_{I_M}},\; \left\{ D^D\left(t_{j,D}\right) \right\}_{j=1}^{N_D},\; \left\{ dI_M^D\left(t_{j,I_M}\right) \right\}_{j=1}^{N_{dI_M}},\; \left\{ dD^D\left(t_{j,D}\right) \right\}_{j=1}^{N_{dD}},\; \left\{ dICU_M^D\left(t_{j,ICU}\right) \right\}_{j=1}^{N_{dICU}} \right)
$$
The complete data of the model are $(y, \psi)$. The unknown residual errors $\theta$ describe the uncertainty of parameters $\psi$.
Therefore, $nLL(y, \psi; \theta)$ is the complete negative log-likelihood. The marginal negative log-likelihood is defined as follows:
$$
nLL(y; \theta) = \int_\Omega nLL(y, \psi; \theta)\, d\psi
$$
The EM algorithm minimizes $nLL(y; \theta)$ iteratively: At the $k$-th iteration of EM, the expectation step computes the conditional expectation of the complete negative log-likelihood $Q_k(\theta) = E\left( nLL(y, \psi; \theta) \mid y, \theta_{k-1} \right)$ by generating $\psi^k$ based on the previous estimates $\theta_{k-1}$, and the maximization step computes the value $\theta_k$ optimizing $Q_k(\theta)$ (i.e., minimizing it in the case of the negative log-likelihood). The EM sequence $(\theta_k)$ converges to a stationary point under general regularity conditions [58].
In nonlinear cases, the expectation step cannot be performed in a closed form. Therefore, we applied the Stochastic Approximation algorithm of EM (SAEM). SAEM is a maximum likelihood estimator of the population parameters [48] based on stochastic integration of marginal probabilities without likelihood approximation such as linearization, quadrature approximation, or sigma-point filtering [17]. Our implementation is inspired by and is very similar to that of earlier versions of the Monolix (Lixoft) software (http://lixoft.com/, accessed on 16 November 2018).
The stochastic approximation version of standard EM algorithm (SAEM) proposed by [48] replaces the usual E-step at an iteration k by a stochastic procedure as follows:
  • Simulation step: draw $m_k$ realizations of $\psi^k = \{\psi_s^k\}_{s=1}^{N_{par}}$ from the conditional distribution $p(\cdot \mid y; \theta_k)$ using the MCMC algorithm.
  • Stochastic approximation: update $Q_k(\theta)$
    $$Q_k(\theta) = Q_{k-1}(\theta) + \gamma_k \cdot \left( \frac{1}{m_k} \sum_{j=1}^{m_k} \log p\left(y, \psi^{k,j}; \theta\right) - Q_{k-1}(\theta) \right),$$
    where $\gamma_k$ is a decreasing sequence of positive numbers.
  • Maximization step (correspondingly, minimization for the negative log-likelihood): update $\theta_k$ according to
    $$\theta_{k+1} = \operatorname*{Arg\,min}_{\theta}\; Q_k(\theta)$$
Remarks:
  • Our stochastic approximation step is an improved version of the stochastic approximation of the integration of the marginal distribution on the multidimensional domain $\Omega$ of possible parameter values:
    $$Q_k(\theta) = E\left( \log p(y, \psi; \theta) \mid y, \theta_{k-1} \right) = \int_\Omega \log p\left(y, \psi^k; \theta_{k-1}\right) d\psi^k$$
  • In analogy to the Monolix software, we selected $\gamma_k$ as follows:
    $$\gamma_k = 1,\; k \le K_1; \qquad \gamma_k = \frac{1}{k - K_1 + 1},\; k > K_1$$
    We choose $K_1$ equal to 4 and run the algorithm until convergence with a tolerance of 0.1% of the estimates of population parameters (see below).
  • We performed MCMC sampling 4000 times at each stage with a burn-in phase of 1000 steps. Thus, $m_k = 3000$.
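The step-size schedule and the stochastic approximation update described above can be sketched as follows. The function names and the toy sequence of sample means are hypothetical; the sketch only demonstrates how the update damps the noise of successive MCMC averages once $k > K_1$.

```python
def step_size(k: int, k1: int = 4) -> float:
    """gamma_k = 1 for k <= K1 and 1 / (k - K1 + 1) afterwards."""
    return 1.0 if k <= k1 else 1.0 / (k - k1 + 1)


def sa_update(previous: float, sample_mean: float, gamma: float) -> float:
    """One stochastic approximation step: move toward the new sample mean."""
    return previous + gamma * (sample_mean - previous)


# Toy sequence of noisy per-iteration averages (made-up numbers).
estimate = 0.0
for k, mean_k in enumerate([10.0, 12.0, 9.0, 11.0, 10.5, 9.5, 10.2], start=1):
    estimate = sa_update(estimate, mean_k, step_size(k))
    print(k, step_size(k), round(estimate, 3))
```

During the first $K_1$ iterations the estimate simply follows the latest sample mean ($\gamma_k = 1$); afterwards the decreasing step size averages over iterations.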
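Exact estimates of the different components of $\theta_k$ are:
$$
\begin{aligned}
a_{dI_M}^{k} &= \frac{1}{N_{dI_M}} \sum_{j=1}^{N_{dI_M}} \int_\Omega \left( dI_M^S\left(t_{j,I_M}, \psi^k\right) - dI_M^D\left(t_{j,I_M}\right) \right)^2 d\psi^k \\
a_{dD}^{k} &= \frac{1}{N_{dD}} \sum_{j=1}^{N_{dD}} \int_\Omega \left( dD^S\left(t_{j,D}, \psi^k\right) - dD^D\left(t_{j,D}\right) \right)^2 d\psi^k \\
a_{dICU}^{k} &= \frac{1}{N_{dICU}} \sum_{j=1}^{N_{dICU}} \int_\Omega \left( dICU^S\left(t_{j,ICU}, \psi^k\right) - dICU^D\left(t_{j,ICU}\right) \right)^2 d\psi^k
\end{aligned}
$$
Therefore, the respective stochastic approximations and maximizations of $\theta_k$ are as follows:
$$
\begin{aligned}
s_{1,j,k} &= s_{1,j,k-1} + \gamma_k \cdot \left( \frac{1}{m_k} \sum_{r=1}^{m_k} \left( dI_M^S\left(t_{j,I_M}, \psi^{k,r}\right) - dI_M^D\left(t_{j,I_M}\right) \right)^2 - s_{1,j,k-1} \right), \quad j = 1, \dots, N_{dI_M}, & a_{dI_M}^{k} &= \frac{\sum_{j=1}^{N_{dI_M}} s_{1,j,k}}{N_{dI_M}} \\
s_{2,j,k} &= s_{2,j,k-1} + \gamma_k \cdot \left( \frac{1}{m_k} \sum_{r=1}^{m_k} \left( dD^S\left(t_{j,D}, \psi^{k,r}\right) - dD^D\left(t_{j,D}\right) \right)^2 - s_{2,j,k-1} \right), \quad j = 1, \dots, N_{dD}, & a_{dD}^{k} &= \frac{\sum_{j=1}^{N_{dD}} s_{2,j,k}}{N_{dD}} \\
s_{3,j,k} &= s_{3,j,k-1} + \gamma_k \cdot \left( \frac{1}{m_k} \sum_{r=1}^{m_k} \left( dICU^S\left(t_{j,ICU}, \psi^{k,r}\right) - dICU^D\left(t_{j,ICU}\right) \right)^2 - s_{3,j,k-1} \right), \quad j = 1, \dots, N_{dICU}, & a_{dICU}^{k} &= \frac{\sum_{j=1}^{N_{dICU}} s_{3,j,k}}{N_{dICU}}
\end{aligned}
$$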
In the same way, the respective terms for the cumulative data approximations are derived.

Appendix F. MCMC Algorithm for the Expectation Step 

Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution [59]. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. It is well known that a proper choice of the proposal distribution for MCMC methods is a crucial factor for convergence of the algorithm [60]. To increase the acceptance rate, a number of adaptive Metropolis (AM) algorithms were proposed by different groups. Here, the proposal distribution is learned along the process using the full information accumulated so far. We implemented the adaptive MCMC version with Gaussian proposal distribution described in [60] as well as the adaptive incremental mixture MCMC [56], called AIMM, which we modified slightly. Strictly speaking, these methods are not really Markov chains, because the proposal distribution of the next step depends on all preceding states $X_0, \dots, X_t$ rather than only on the previous one. The algorithm of Haario et al. is simpler and assumes the existence of a global minimum of nLL. In contrast, the algorithm of Maire et al. could be useful for cases in which the nLL has a complex topology due to overfitting.
Let $\pi$ denote the target distribution (i.e., negative log-likelihood) given by Equation (A8). At each iteration step, the new parameter vector $Y$ is generated by a transition kernel representing the proposal distribution. This candidate vector is accepted with probability
$$
\alpha\left(X_{t-1}, Y\right) = \min\left(1, \frac{\pi(Y)}{\pi\left(X_{t-1}\right)}\right)
$$
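In practice, the acceptance ratio is usually evaluated on the log scale. The sketch below assumes that the target density is proportional to $\exp(-nLL)$, so that the ratio $\pi(Y) / \pi(X_{t-1})$ becomes $\exp(nLL(X_{t-1}) - nLL(Y))$; this relation and the function names are assumptions for illustration and are not taken from the authors' implementation.

```python
import math
import random


def acceptance_probability(nll_current: float, nll_candidate: float) -> float:
    """min(1, pi(Y) / pi(X)) with pi proportional to exp(-nLL)."""
    # Clamping the exponent at 0 implements the min(1, .) and avoids overflow.
    return math.exp(min(0.0, nll_current - nll_candidate))


def accept(nll_current: float, nll_candidate: float, rng: random.Random) -> bool:
    """Metropolis decision: accept the candidate with the probability above."""
    return rng.random() < acceptance_probability(nll_current, nll_candidate)


rng = random.Random(0)
print(acceptance_probability(5.0, 3.0))  # candidate fits better
print(acceptance_probability(3.0, 5.0))  # candidate fits worse
print(accept(5.0, 3.0, rng))
```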
The transition kernel of Haario’s MCMC version is an empirical covariance matrix of previous samples stabilized by an identity matrix multiplied by a small number $\varepsilon$:
$$
C_t^k = s_d \cdot \operatorname{cov}\left(\psi^{k,1}, \dots, \psi^{k,t-1}\right) + \varepsilon \cdot I_d, \qquad t \le m_k,
$$
where t is a sampling number and s d = 2.4 N p a r i n d . Here, we choose ε = 0.0001 . This parameter is required to ensure ergodic property of the Markov chain. When k > 1, we added samples from the previous step to the covariance matrix:
$$
C_t^k = s_d \cdot \operatorname{cov}\left(\psi^{k-1,1}, \dots, \psi^{k-1,m_k}, \psi^{k,1}, \dots, \psi^{k,t-1}\right) + \varepsilon \cdot I_d, \qquad t \le m_k
$$
At each iteration k we used a sample from the previous iteration providing a small value of nLL as starting point.
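The stabilized proposal covariance can be illustrated without external libraries. In the sketch below, the scaling factor $s_d$ is passed in as an argument rather than hard-coded, the function name is hypothetical, and the sample values are made up.

```python
def adaptive_covariance(samples, s_d, eps=1e-4):
    """s_d * cov(samples) + eps * I for a list of equally long parameter vectors."""
    n, dim = len(samples), len(samples[0])
    means = [sum(x[i] for x in samples) / n for i in range(dim)]
    cov = [
        [
            sum((x[i] - means[i]) * (x[j] - means[j]) for x in samples) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    # Adding eps on the diagonal keeps the matrix positive definite.
    return [
        [s_d * cov[i][j] + (eps if i == j else 0.0) for j in range(dim)]
        for i in range(dim)
    ]


# Four made-up two-dimensional samples with uncorrelated components.
print(adaptive_covariance([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]], s_d=1.0))
```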
In the AIMM, the proposal distributions $Q_t$ are mixtures of multivariate normal distributions. Roughly speaking, this is a generalization of Haario’s algorithm for the case that multiple local minima of the nLL exist in a few clusters. The candidate vector is accepted with probability
$$
\alpha\left(X_{t-1}, Y\right) = \min\left(1, \frac{\pi(Y) / Q_t}{\pi\left(X_{t-1}\right) / Q_{t-1}}\right)
$$
This algorithm minimizes discrepancies between the proposal and the target probability, i.e., the sequence $Q_t$ converges to $\pi$ by approximating it through mixtures of multivariate normal distributions. The elements of this series $Q_t$ are defined as follows:
$$
Q_t = \frac{\sum_{l=1}^{M_t} \beta_l \cdot \varphi_l}{\sum_{l=1}^{M_t} \beta_l},
$$
where $M_t$ is the number of components at iteration $t$. The elements $\varphi_1, \dots, \varphi_{M_t}$ represent the incremental mixture components, and $\beta_1, \dots, \beta_{M_t}$ are the respective weights. Each mixture component consists of a mean vector and a covariance matrix. The sampling from $Q_t$ proceeds as follows: We choose the $r$-th component with probability $\beta_r / \sum_{l=1}^{M_t} \beta_l$ by generating a uniformly distributed random number and accepting the $r$-th component if this number lies between $\beta_{r-1} / \sum_{l=1}^{M_t} \beta_l$ and $\beta_r / \sum_{l=1}^{M_t} \beta_l$ if $r > 1$, or between 0 and $\beta_r / \sum_{l=1}^{M_t} \beta_l$ if $r = 1$. After the choice of the $r$-th component, a random parameter vector $Y$ is generated around the $r$-th mean according to the $r$-th covariance matrix as in Haario’s algorithm. If $Y$ is accepted, it becomes $X_t$. $X_t$ can either stay in the $r$-th cluster or give rise to a new cluster $\varphi_{M_t}$ with $M_t = 1 + M_{t-1}$. A new cluster is created when the match of $\varphi_{M_t}$ to the $r$-th cluster is insufficient based on the Mahalanobis distance. If $X_t$ stays in the $r$-th cluster, it updates the $r$-th covariance matrix in a similar way as in Haario’s algorithm [60], Equation (A26).
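The choice of a mixture component with probability proportional to its weight can be sketched with a standard inverse-CDF draw over the cumulative weights. Only the component selection is covered here, not the subsequent Gaussian draw; the function name and the weights are hypothetical.

```python
import random
from itertools import accumulate


def choose_component(weights, u: float) -> int:
    """Return index r such that u falls into the r-th slice of the cumulative weights.

    weights: unnormalized mixture weights beta_1 .. beta_M
    u:       uniformly distributed random number in [0, 1)
    """
    total = sum(weights)
    for r, cumulative in enumerate(accumulate(weights)):
        if u * total < cumulative:
            return r
    return len(weights) - 1  # guard against rounding at the upper border


rng = random.Random(1)
draws = [choose_component([1.0, 3.0], rng.random()) for _ in range(10000)]
print(draws.count(1) / len(draws))  # should be close to 3 / (1 + 3)
```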
We here modified the conditions for new cluster formation compared to [56] as follows. In our algorithm, a new cluster is formed when one of the following conditions holds:
  • The Mahalanobis distance of $X_t$ to the cluster from which it was generated is less than 0.025 or larger than 0.975. That is, $X_t$ diverges significantly from the current multivariate normal distribution of the $r$-th cluster.
  • $\pi(X_t)$ is significantly larger than $\pi$ of the current cluster center. That is, $X_t$ does not correspond to the local maximum of $\pi$ in the neighbourhood of the $r$-th cluster.
If one of the above conditions holds, $X_t$ becomes the center of a new cluster. The respective Gaussian component inherits the covariance matrix of the $r$-th (i.e., previous) cluster. This matrix is further updated whenever new members of the new cluster are accepted in future proposals.
The weights $\beta_1, \ldots, \beta_{M_t}$ are proportional to $\pi$ of the respective cluster centers to the power of $\gamma$, where $\gamma$ is a positive number less than 1. All weights are updated whenever a new cluster emerges.
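The weighting rule can be illustrated with a short sketch. It assumes that the target density at each cluster center is available on the log scale (for example as a negative nLL, which is an assumption about how $\pi$ is evaluated); the function name is hypothetical.

```python
import math


def cluster_weights(log_pi_centers, gamma):
    """Normalized weights proportional to pi(center) ** gamma.

    log_pi_centers: log target density at each cluster center (assumed input).
    gamma: tempering exponent, a positive number less than 1.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    scaled = [gamma * lp for lp in log_pi_centers]
    shift = max(scaled)  # subtract the maximum to avoid overflow in exp
    unnormalized = [math.exp(s - shift) for s in scaled]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]
```

Because $\gamma < 1$, differences between clusters are flattened, so clusters with a lower target density still receive a non-negligible share of proposals.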
In summary, AIMM accepts proposals with discrepancies to the target distribution. As a consequence, proposal distributions are multivariate normal mixtures. Every cluster’s mean is a local maximum of π. Sampling of proposal distributions from clusters depends on π. New clusters emerge when an accepted proposal either significantly diverges from the respective cluster’s probability or when a significantly better optimal value is found in this cluster.
After thorough comparison of adaptive MCMC and the adaptive incremental mixture MCMC, we found the latter to be superior. Higher values of π were found in a shorter time. It also generates higher acceptance rates (0.2–0.3 versus 0.1) and finds more alternative solutions. We therefore used this method for our parameter estimations.
We applied Geweke convergence diagnostics for Markov chains [61] as implemented in the R-package coda (https://cran.r-project.org/web/packages/coda/coda.pdf, accessed on 1 October 2020). This method is based on a test for equality of the means of the first and last part of a Markov chain (by default the first 10% and the last 50%). If the samples are drawn from a stationary distribution of the chain, the two means are equal and Geweke’s statistic asymptotically follows a standard normal distribution. The test statistic is a standard Z-score: the difference between the two sample means divided by its estimated standard error. The standard error is estimated from the spectral density at zero, taking autocorrelation into account. The Z-score is calculated under the assumption that the two parts of the chain are asymptotically independent. We applied this diagnostic to the nLL resulting from the last run of the implemented adaptive MCMC algorithm, resulting in a chain of 2620 steps. We considered the default fractions of the chain, i.e., 0.1 from the beginning and 0.5 from the end of the chain, respectively. The resulting Z-score is −0.2536, corresponding to a p-value of 0.4, i.e., no deviation of the means was detected, suggesting that a stationary distribution was achieved.
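The logic of the diagnostic can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the coda implementation: the standard error below uses plain sample variances and therefore ignores autocorrelation, whereas coda estimates it from the spectral density at zero. The last lines show that a one-sided standard normal tail probability of the reported Z-score is close to 0.4; whether the reported p-value was obtained this way is an assumption.

```python
import math
import statistics


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def naive_geweke_z(chain, first=0.1, last=0.5):
    """Z-score comparing the mean of the first and the last part of a chain.

    Simplification: variances of the means are estimated as if the samples
    were independent, i.e., autocorrelation is not taken into account.
    """
    n = len(chain)
    head = chain[: int(first * n)]
    tail = chain[n - int(last * n):]
    var_of_difference = (statistics.variance(head) / len(head)
                         + statistics.variance(tail) / len(tail))
    return (statistics.mean(head) - statistics.mean(tail)) / math.sqrt(var_of_difference)


z_reported = -0.2536
one_sided_p = normal_cdf(z_reported)                  # lower tail probability
two_sided_p = 2.0 * normal_cdf(-abs(z_reported))      # two-sided alternative
```

Under either convention the Z-score is far from the usual rejection thresholds, which matches the conclusion that no difference of means was detected.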
Figure A2 (generated by the function geweke.plot from the coda package) shows the development of Geweke’s Z-score when successively larger numbers of iterations are discarded from the beginning of the chain. The Z-score always remains within the 95% confidence interval, suggesting successful convergence.
Figure A2. Development of Geweke’s Z-score when successively larger numbers of iterations are discarded from the beginning of the MCMC chain. Dashed lines correspond to quantiles of 0.025 and 0.975 for the Z-score. No local trends are detected, i.e., a stationary sampling distribution is achieved.

Appendix G. MCMC Simulation for Prediction and Controlling Goodness of Fit 

The estimates of residual errors are determined at the last step and are used for MCMC sampling of parameters ψ. The resulting means and standard deviations are considered as respective average estimates and their standard errors. Simulations of these parameter samples provide a set of alternative predictions. From these, we collected the best fitting solution, the average solution and confidence intervals for different confidence limits α.

Appendix H. Justification of Prior Parameters and Ranges 

We here provide justifications of assumed prior values and parameter ranges. Details of parameter definitions and fitting can be found in Appendix B and Appendix D, respectively.
Initial influx of people per day Influx: The initial influx was estimated from the data without prior assumptions to a value of 6937 people per day in order to initialize the simulation. Later, the parameter is no longer relevant for simulation outcomes.
Infection rate through asymptomatic subjects per day r1: This infection rate was estimated from the data without prior assumptions. It represents the basic transmission probability of the SARS-CoV-2 virus from an asymptomatic infectious person to a susceptible contact.
Infection rate through symptomatic subjects per day r2: This infection rate was estimated from the data without prior assumptions. It represents the basic transmission probability of the SARS-CoV-2 virus from a symptomatic infectious person to a susceptible contact.
Relative infection intensity of asymptomatic subjects per day b1(t): The infectivity of asymptomatic infected subjects was assumed as a time-dependent step-function due to changing NPIs/contact behavior and other factors influencing infection probabilities. Steps were estimated from the data without prior assumptions.
Relative infection intensity of symptomatic subjects per day b2(t): The infectivity of symptomatic infected subjects was assumed as a time-dependent step-function due to changing NPIs/contact behavior and other factors influencing infection probabilities. Steps were estimated from the data without prior assumptions.
Ratio of b1(t) and b2(t), rb1,2: We assumed a fixed ratio of the infectivities of asymptomatic and symptomatic infected subjects for the sake of parsimony. The ratio was estimated from the data without prior assumptions.
Fixed dates for updates of infectivity functions: We used several fixed dates of changes in infectivity functions due to known changes in NPIs, testing policy or outbreaks. Note that even fixed time points were checked for the necessity to assume changes in infectivity for the sake of parsimony, i.e., respective steps were only assumed if they significantly improved model fit. The first three fixed time points, tr1 (10 March 2020), tr2 (15 March 2020), and tr3 (22 March 2020), reflect German governmental interventions including regulation of the size of public events, travel restrictions, and contact restrictions. Fixed time points tr6 (30 April 2020), tr7 (7 May 2020), and tr8 (21 May 2020) reflect German governmental interventions related to the step-wise relaxation of NPIs, in particular regarding leisure sports, contacts, and schools. Time point tr17 (2 November 2020) reflects governmental NPIs in response to the German second wave, including restrictions of public life and social contacts, also referred to as “soft lockdown”. Time point tr21 (16 December 2020) reflects further, stricter governmental NPIs in response to the ongoing increase of the German second wave, also referred to as “hard lockdown”, strongly limiting public and private contacts including school closures. Finally, time point tr28 (23 February 2021) reflects the release of many governmental NPIs in response to the decline of the German second wave.
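A time-dependent step function of this kind can be evaluated with a simple lookup. The sketch below uses the first three fixed dates and the corresponding infectivity estimates for Germany from Table A1; it assumes that each value holds from its date until the next change, and the value before the first change (`initial`) is left unspecified because it is not part of this illustration. All names are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# First three change points for Germany (dates and estimates from Table A1).
CHANGE_DATES = [date(2020, 3, 10), date(2020, 3, 15), date(2020, 3, 22)]
LEVELS = [0.676, 0.150, 0.214]


def step_value(day, change_dates=CHANGE_DATES, levels=LEVELS, initial=None):
    """Return the level that is active on `day`.

    A level applies from its change date (inclusive) until the next change
    date (exclusive). Before the first change date, `initial` is returned.
    """
    index = bisect_right(change_dates, day)
    return initial if index == 0 else levels[index - 1]
```

For example, `step_value(date(2020, 3, 16))` returns the level that became active on 15 March 2020. The same lookup could serve for the step functions pcrit and pdeath with their own dates and values.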
Transit rate for compartment E (latent time) r3: The transition rate r3 for the compartment of exposed subjects is the inverse of the latent time, i.e., the time being infected but not yet infectious. The mean of the prior distribution for the latent time was set to 3 days and the minimum and maximum of the distribution were set to 2 and 4 days, respectively, in accordance with previous reports [10]. Note that minimum and maximum of a parameter’s distribution in this section always refer to the distribution of the mean of the parameter, not to the distribution of the parameter itself. Further justification of this parameter is given below when considering the rate r4b.
Transit rate for asymptomatic sub-compartments r4: The transition rate r4 for the asymptomatic infectious compartment to the recovered compartment is three times the inverse of the time being asymptomatic and infectious, as this compartment is split into three sub-compartments. The mean of the prior distribution of r4 was set to 3/5 per day and the minimum and maximum of the distribution were set to 3/10 and 3/4 per day, respectively. These values are based on general considerations regarding timelines of the germinal center reaction [25] and further supported by reports from the literature estimating relevant infectiousness periods in general or asymptomatic/mildly symptomatic COVID-19 patients as between 3.5 and 9.5 days [22,23,24].
Rate of development of symptoms after infection r4b: The inverse of this rate is equal to the time from being infectious to the start of developing symptoms. The mean of the prior distribution of r4b was set to 1/2.5 per day and the minimum and maximum of the distribution were set to 1/5 and 1/1 per day, respectively. This is in line with previous reports [10,26,27]. Note that the serial interval, i.e., the average time between successive cases in a chain of transmission, is composed of two parameters of our model. In detail, the serial interval is the sum of the average latent time (1/r3) and half of the average time being infectious when assuming random occurrence of subsequent infections during the time of infectiousness. For example, in a scenario where symptomatic individuals are immediately and effectively quarantined, the serial interval would be 1/r3 + 0.5*1/r4b. The serial interval was estimated by the RKI [28] to have a median of 4 days (interquartile range 3–5 days) based on the literature [18,19,20,21], which is in accordance with our choices for r3 and r4b. However, the serial interval (and other parameters like the time being infectious) are to some extent also time dependent, reflecting e.g., behavioral changes. Although we do not model a time dependence for these specific parameters, our model can, to a certain extent, compensate for this by data-driven adaptation of other time-dependent parameters such as b1 and b2.
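The serial interval formula for the quarantine scenario can be checked numerically against the stated prior values. The following sketch is only a worked calculation of 1/r3 + 0.5*1/r4b; the function name is hypothetical.

```python
def serial_interval_days(r3, r4b):
    """Serial interval if symptomatic cases are immediately quarantined.

    r3:  transit rate for compartment E (per day), inverse of the latent time.
    r4b: rate of developing symptoms (per day), inverse of the pre-symptomatic
         infectious time.
    """
    return 1.0 / r3 + 0.5 * (1.0 / r4b)


# Prior means: latent time 3 days, r4b = 1/2.5 per day.
at_prior_means = serial_interval_days(r3=1 / 3, r4b=1 / 2.5)

# Extremes of the prior ranges: latent time 2 to 4 days, r4b from 1/1 to 1/5 per day.
shortest = serial_interval_days(r3=1 / 2, r4b=1 / 1)
longest = serial_interval_days(r3=1 / 4, r4b=1 / 5)
```

With the prior means this gives about 4.25 days, and the prior ranges span roughly 2.5 to 6.5 days, which covers the interquartile range of 3 to 5 days reported by the RKI.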
Probability of developing symptomatic disease after infection psymp: This probability was estimated from the literature, reporting a percentage of symptomatic COVID-19 cases between 55% and 85% [37,38,39]. We used a percentage of 50% as mean of the prior distribution, accounting for the fact that minor symptoms are frequently ignored or considered as symptoms of a common cold. Minimum and maximum were set to 0.3 and 0.8, respectively.
Transit rate of symptomatic sub-compartments r5: The transition rate r5 for the three symptomatic sub-compartments towards recovery is three times the inverse of the time being symptomatic and infectious. The mean of the prior distribution of r5 was set to 3/2.5 per day and the minimum and maximum of the distribution were set to 3/7.5 and 3/1.5 per day, respectively. These values are based on the assumption that symptomatic and asymptomatic subjects are similar with respect to the time of contagiousness. Hence, the durations underlying the distribution of r5 equal those of r4 minus the mean duration implied by r4b (2.5 days).
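The relation between the prior values of r4, r4b and r5 can be verified with a short calculation. The sketch assumes that a compartment split into three sub-compartments in series, each left at rate r, has a mean total residence time of 3/r; the helper names are hypothetical.

```python
N_SUB = 3  # number of sub-compartments per compartment


def mean_duration_days(rate, n_sub=N_SUB):
    """Mean total time in a chain of n_sub sub-compartments with equal rate."""
    return n_sub / rate


def rate_from_duration(days, n_sub=N_SUB):
    """Per-sub-compartment transit rate for a given mean total duration."""
    return n_sub / days


R4B_MEAN_DAYS = 2.5  # inverse of the prior mean of r4b (1/2.5 per day)

# Prior mean, minimum and maximum of r4 (per day): about 5, 10 and 4 days.
asymptomatic_days = [mean_duration_days(r) for r in (3 / 5, 3 / 10, 3 / 4)]

# Symptomatic infectious time: the same durations shortened by 2.5 days.
symptomatic_days = [d - R4B_MEAN_DAYS for d in asymptomatic_days]

# Resulting prior mean, minimum and maximum of r5 (per day).
r5_prior = [rate_from_duration(d) for d in symptomatic_days]
```

The resulting durations of about 2.5, 7.5 and 1.5 days correspond to the stated prior values of 3/2.5, 3/7.5 and 3/1.5 per day for r5.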
Rate of development of critical state after becoming symptomatic r6: The inverse of this rate is assumed equal to the time of developing a critical state after being infectious and symptomatic. The mean of the prior distribution of r6 was set to 1/5 per day and the minimum and maximum of the distribution were set to 1/7 and 1/4 per day, respectively, according to previous reports [10,29,30,31]. Note that the probability of people becoming critical is affected by the function pcrit.
Probability of becoming critical after developing symptoms pcrit: This probability is assumed as a time-dependent step-function reflecting for example changing age-distributions of infected subjects or treatment efficacy. Steps were estimated from the data within the range of 0 to 1 without assuming a specific prior. The initial value of pcrit,0 was estimated as 0.075, which is within the range of reported values [10,62].
Transit rate for critical state sub-compartments r7: The transition rate r7 for the critical state sub-compartments is three times the inverse of the time treated in the intensive care unit (ICU) for survivors, as the critical compartment is also split into three sub-compartments. The mean of the prior distribution of r7 was set to 3/17 per day and the minimum and maximum of the distribution were set to 3/35 and 3/8 per day, respectively. These values are informed by previous reports focusing on data of 35 to 79-year-old patients, the most frequent population in ICU [10,32,33,34].
Death rate of patients in critical sub-compartment r8: This transition rate represents the rate from the first ICU sub-compartment to the death compartment. It is the inverse of the time in ICU of patients who passed away. The mean of the prior distribution of r8 was set to 1/8 per day and the minimum and maximum of the distribution were set to 1/14 and 1/6.5 per day, respectively. This reflects the shorter time in ICU for patients with fatal disease outcome, informed by previous reports [29,35,36]. The number of people with a fatal disease course is affected by two additional parameters, pdeath and pdeath,S, explained below.
Probability of death after becoming critical pdeath: This is the probability of death for patients at ICU. It is assumed as a time-dependent step-function estimated from data. Values are restricted within the range 0 to 1 without specific prior assumptions. Changes in time reflect for example changes in age-composition of ICU patients as well as changes in treatment regimens. The initial value is pdeath,0 = 0.118.
Probability of death after developing symptoms without becoming critical pdeath,S: To reflect COVID-19 related deaths outside of ICU (especially relevant for the oldest age-groups [32]), we introduced the probability pdeath,S of transitioning from the second symptomatic sub-compartment to the death compartment. This probability was estimated from the data as pdeath,S = 0.0448.
Fraction of unreported cases pS,M: For the fraction of infected cases that are symptomatic but not reported, we used a prior distribution with a mean of 0.5, a minimum of 0.3 and a maximum of 0.9. This choice was informed by studies of SARS-CoV-2 seroprevalence in Germany [40,41]. Note that the total percentage of unreported infected people is 1-pS,M∙psymp according to the definition of psymp.
The factor mur is multiplied to r1 and r2, reflecting the higher infectivity of the B.1.1.7 variant compared to the previous variants. This parameter was estimated from sequencing data reporting the dynamics of the increase of variant B.1.1.7 in the UK, Denmark, Belgium, Switzerland, and the United States (available from https://github.com/tomwenseleers/newcovid_belgium/, accessed on 13 April 2022) and Germany (available from “Mutationstracking-Projekt von Sven Schmidt” at https://tinyurl.com/36xnmxat, accessed on 17 May 2022). Thereby, mur was calibrated to match the observed average dynamic of the increase of B.1.1.7 across countries, resulting in a value of mur = 1.7.

Appendix I. Parameter Values for Germany 

Table A1. Time points of changes in infectivity and respective steps. We used fixed (known due to Governmental decisions or random events) and estimated time points of NPI/contact behavior changes and events and respective changes in infectivity of asymptomatic subjects. We provide estimates and relative standard errors of the infectivity. For estimated time points, we also provide the respective standard error (last column).
| Number | Type of NPI/Contact Behaviour Change | Estimated New Infectivity | Relative Standard Error, % | Date | Source of Time Point | Standard Error (Days) |
|---|---|---|---|---|---|---|
| 1 | Intensification | 0.676 | 0.738 | 10 March 2020 | Fixed | - |
| 2 | Intensification | 0.150 | 3.99 | 15 March 2020 | Fixed | - |
| 3 | Relaxation | 0.214 | 0.711 | 22 March 2020 | Fixed | - |
| 4 | Intensification | 0.131 | 2.79 | 29 March 2020 | Estimated | 0.280 |
| 5 | Relaxation | 0.172 | 2.78 | 23 April 2020 | Estimated | 0.164 |
| 6 | Relaxation | 0.200 | 0.462 | 30 April 2020 | Fixed | - |
| 7 | Intensification | 0.109 | 5.78 | 7 May 2020 | Fixed | - |
| 8 | Relaxation | 0.177 | 5.13 | 14 May 2020 | Fixed | - |
| 9 | Intensification | 0.163 | 0.278 | 22 May 2020 | Estimated | 0.322 |
| 10 | Relaxation | 0.434 | 0.644 | 5 June 2020 | Estimated | 0.387 |
| 11 | Intensification | 0.142 | 3.67 | 13 June 2020 | Estimated | 0.360 |
| 12 | Relaxation | 0.270 | 2.80 | 1 July 2020 | Estimated | 0.251 |
| 13 | Intensification | 0.193 | 1.06 | 11 August 2020 | Estimated | 0.244 |
| 14 | Relaxation | 0.256 | 1.01 | 28 August 2020 | Estimated | 0.264 |
| 15 | Relaxation | 0.357 | 1.06 | 1 October 2020 | Estimated | 0.119 |
| 16 | Intensification | 0.246 | 0.967 | 19 October 2020 | Estimated | 0.334 |
| 17 | Intensification | 0.198 | 3.98 | 2 November 2020 | Fixed | - |
| 18 | Relaxation | 0.213 | 0.991 | 11 November 2020 | Estimated | 1.08 |
| 19 | Relaxation | 0.256 | 0.550 | 24 November 2020 | Estimated | 0.262 |
| 20 | Intensification | 0.248 | 1.61 | 1 December 2020 | Estimated | 0.303 |
| 21 | Intensification | 0.118 | 2.18 | 16 December 2020 | Fixed | - |
| 22 | Relaxation | 0.421 | 1.40 | 26 December 2020 | Estimated | 0.238 |
| 23 | Intensification | 0.154 | 3.48 | 1 January 2021 | Estimated | 0.118 |
| 24 | Relaxation | 0.182 | 11.0 | 12 January 2021 | Estimated | - |
| 25 | Relaxation | 0.237 | 3.09 | 6 February 2021 | Estimated | - |
| 26 | Intensification | 0.211 | 4.08 | 15 February 2021 | Estimated | - |
| 27 | Relaxation | 0.233 | 2.98 | 25 February 2021 | Estimated | - |
| 28 | Relaxation | 0.228 | 23.2 | 18 March 2021 | Fixed | - |
Table A2. Determination of the number of time steps of input step functions. We analyzed different numbers of steps for the step functions pcrit (Ncrit) and pdeath (Ndeath). Npar = number of parameters to be estimated, nLL = negative log-likelihood, BIC = Bayesian information criterion. A total of 1714 data points were analyzed (348 measurements each of daily and cumulative new cases and deaths, 322 measurements of daily critical cases). The combination Ncrit = 18 and Ndeath = 19 resulted in the lowest BIC, i.e., the best compromise between model parsimony and fit. The best solution resulted from estimation of 134 parameters as follows: 15 basic parameters (Table A5), 28 infectivity changes at 19 time points (Table A1), 18 values for $\alpha_{crit,i}$ with respect to 17 time points and 19 values for $\alpha_{death,i}$ with respect to 18 time points. Alternative assumptions on Ncrit and Ndeath resulted in respective changes of the total number of parameters. The best results are in bold.
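| $N_{crit}$ | $N_{death}$ | $N_{par}$ | nLL | BIC |
|---|---|---|---|---|
| **18** | **19** | **134** | **2620** | **6238** |
| 17 | 17 | 132 | 2661 | 6298 |
| 17 | 19 | 133 | 2645 | 6280 |
| 18 | 18 | 133 | 2633 | 6256 |
| 19 | 20 | 136 | 2616 | 6245 |
| 19 | 19 | 135 | 2618 | 6241 |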
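The model selection by BIC can be illustrated with a short calculation. The sketch assumes the standard definition BIC = 2·nLL + Npar·ln(n), which is not spelled out in this appendix, and uses the values of Npar and nLL from Table A2 together with n = 1714 data points. Variable names are hypothetical.

```python
import math


def bic(nll, n_par, n_obs):
    """Bayesian information criterion from a negative log-likelihood."""
    return 2.0 * nll + n_par * math.log(n_obs)


# 348 measurements for each of four series plus 322 ICU measurements.
N_OBS = 4 * 348 + 322

# (Ncrit, Ndeath) -> (Npar, nLL), taken from Table A2.
candidates = {
    (18, 19): (134, 2620),
    (17, 17): (132, 2661),
    (17, 19): (133, 2645),
    (18, 18): (133, 2633),
    (19, 20): (136, 2616),
    (19, 19): (135, 2618),
}

scores = {key: bic(nll, n_par, N_OBS) for key, (n_par, nll) in candidates.items()}
best = min(scores, key=scores.get)
```

Under this definition the combination Ncrit = 18 and Ndeath = 19 has the lowest score, and its value rounds to the tabulated BIC of 6238. Models with more steps reach a slightly lower nLL but are penalized for their additional parameters.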
Table A3. Step functions of pcrit and pdeath. We present estimates for the single steps of the functions pcrit and pdeath at the specified dates and respective standard errors. We also provide the standard error of the estimated time point (last column).
| Parameter | Description | Estimate | Relative Standard Error, % | Date (Respective Controls) | Standard Error (Days) |
|---|---|---|---|---|---|
| $\alpha_{crit,1}$ | Relative values of $p_{crit}$ starting at the respective date | 1.05 | 0.317 | 20 March 2020 | 0.0844 |
| $\alpha_{crit,2}$ | | 2.48 | 3.18 | 1 April 2020 | 0.14 |
| $\alpha_{crit,3}$ | | 2.24 | 3.46 | 6 May 2020 | 1.06 |
| $\alpha_{crit,4}$ | | 1.22 | 3.20 | 4 June 2020 | 2.03 |
| $\alpha_{crit,5}$ | | 0.884 | 0.626 | 6 July 2020 | 3.75 |
| $\alpha_{crit,6}$ | | 0.344 | 2.07 | 30 July 2020 | 1.14 |
| $\alpha_{crit,7}$ | | 0.340 | 0.381 | 24 August 2020 | 6.94 |
| $\alpha_{crit,8}$ | | 0.301 | 4.25 | 20 September 2020 | 0.705 |
| $\alpha_{crit,9}$ | | 0.238 | 1.15 | 6 October 2020 | 1.52 |
| $\alpha_{crit,10}$ | | 0.330 | 1.03 | 23 October 2020 | 1.42 |
| $\alpha_{crit,11}$ | | 0.382 | 0.801 | 8 November 2020 | 0.870 |
| $\alpha_{crit,12}$ | | 0.419 | 1.70 | 20 November 2020 | 6.20 |
| $\alpha_{crit,13}$ | | 0.633 | 1.53 | 23 December 2020 | 1.43 |
| $\alpha_{crit,14}$ | | 0.651 | 1.51 | 1 January 2021 | 0.506 |
| $\alpha_{crit,15}$ | | 0.929 | 1.12 | 22 January 2021 | 3.43 |
| $\alpha_{crit,16}$ | | 0.647 | 3.41 | 13 February 2021 | 3.08 |
| $\alpha_{crit,17}$ | | 0.394 | 0.972 | 5 March 2021 | 6.15 |
| $\alpha_{crit,18}$ | | 0.441 | 62.8 | 18 March 2021 | - |
| $\alpha_{death,1}$ | Relative values of $p_{death}$ starting at the respective date | 2.39 | 1.88 | 26 March 2020 | 0.164 |
| $\alpha_{death,2}$ | | 3.58 | 1.17 | 23 April 2020 | 0.283 |
| $\alpha_{death,3}$ | | 1.94 | 4.55 | 19 May 2020 | 1.45 |
| $\alpha_{death,4}$ | | 0.743 | 1.19 | 10 June 2020 | 0.393 |
| $\alpha_{death,5}$ | | 0.296 | 3.25 | 5 July 2020 | 5.72 |
| $\alpha_{death,6}$ | | 0.401 | 0.635 | 27 July 2020 | 6.15 |
| $\alpha_{death,7}$ | | 0.142 | 1.22 | 25 August 2020 | 4.51 |
| $\alpha_{death,8}$ | | 0.473 | 7.46 | 17 September 2020 | 1.20 |
| $\alpha_{death,9}$ | | 0.314 | 6.39 | 8 October 2020 | 1.56 |
| $\alpha_{death,10}$ | | 0.638 | 0.966 | 1 November 2020 | 2.18 |
| $\alpha_{death,11}$ | | 1.41 | 0.748 | 22 November 2020 | 1.17 |
| $\alpha_{death,12}$ | | 1.64 | 2.53 | 11 December 2020 | 1.89 |
| $\alpha_{death,13}$ | | 2.54 | 1.35 | 29 December 2020 | 0.499 |
| $\alpha_{death,14}$ | | 2.66 | 3.01 | 7 January 2021 | 6.02 |
| $\alpha_{death,15}$ | | 3.48 | 6.42 | 18 January 2021 | 1.43 |
| $\alpha_{death,16}$ | | 2.31 | 4.80 | 5 February 2021 | 0.794 |
| $\alpha_{death,17}$ | | 1.22 | 3.15 | 27 February 2021 | 2.315 |
| $\alpha_{death,18}$ | | 0.807 | 2.15 | 9 March 2021 | 3.75 |
| $\alpha_{death,19}$ | | 1.09 | 69.1 | 19 March 2021 | - |
Table A4. Residual errors of observables. We present the residual errors of fitting our model to the time frame 3 March 2020 to 21 March 2021. dIM = daily incident cases, IM = cumulative incident cases, dICU = daily occupation of ICU beds, dD = daily deaths, D = cumulative deaths. Case numbers were square root transformed, i.e., units of values are cases to the power of 0.5.
| Parameter | Value for Germany | Value for Saxony |
|---|---|---|
| $a_{dIM}$ | 3.62 | 0.921 |
| $a_{IM}$ | 5.69 | 1.03 |
| $a_{dICU}$ | 1.19 | 0.442 |
| $a_{dD}$ | 3.04 | 0.377 |
| $a_{D}$ | 0.99 | 1.14 |
Table A5. Parameter estimates and comparison with average priors. We present estimated parameters of the SECIR model and initial conditions of control parameters and their respective standard errors for Germany. We also perform a formal comparison of estimates and expected priors using a t-test.
| Parameter | Description | Posterior Estimate | Relative Standard Error, % | Prior Value | p-Value |
|---|---|---|---|---|---|
| influx | Initial influx of infections into compartment E until first interventions | 3171 | 3.12 | - | - |
| $r_1$ | Infection rate through asymptomatic subjects | 1.19 | 0.582 | - | - |
| $r_3$ | Transit rate for compartment E (latent time) | 0.272 | 0.0571 | 1/3 | 0.213 |
| $r_4$ | Transit rate for asymptomatic sub-compartments | 0.636 | 0.734 | 3/5 | 0.429 |
| $r_{4,b}$ | Rate of development of symptoms after infection | 0.456 | 2.17 | 1/2.5 | 0.346 |
| $r_5$ | Transit rate for symptomatic sub-compartments | 0.946 | 2.33 | 3/2.5 | 0.499 |
| $r_6$ | Rate of development of critical state after being symptomatic | 0.186 | 0.405 | 1/5 | 0.457 |
| $r_7$ | Transit rate for critical state sub-compartment | 0.159 | 0.336 | 3/17 | 0.402 |
| $r_8$ | Death rate of patients in critical sub-compartment 1 | 0.104 | 0.409 | 1/8 | 0.441 |
| $r_{b1,2}$ | Proportionality coefficient of intensifications/relaxations between $b_1$ and $b_2$ | 0.379 | 9.18 | - | - |
| $P_{S,M}$ | Fraction of reported cases | 0.499 | 0.102 | 1/2 | |
| $p_{crit}$ ($p_{crit,0}$) | Probability of becoming critical after developing symptoms (initial value) | 0.0765 | 0.706 | - | - |
| $p_{death}$ ($p_{death,0}$) | Probability of death after becoming critical (initial value) | 0.119 | 1.24 | - | - |
| $p_{death,S,0}$ | Proportionality coefficient for evaluating probability of death after developing symptoms without becoming critical, see (A6) | 0.587 | 8.04 | - | - |

Appendix J. Parameter Values for Saxony 

Table A6. Time points of changes in infectivity and respective values for Saxony. We used fixed (known due to Governmental decisions or random events) and estimated time points of changes of NPI/contact behavior and events and respective changes in infectivity of asymptomatic subjects. We provide estimates and relative standard errors of the infectivity starting with the date mentioned (columns 3 to 5).
| Number | Type of NPI/Behavior Change | Estimated New Infectivity | Relative Standard Error, % | Date | Source | Standard Error (Days) |
|---|---|---|---|---|---|---|
| 1 | Intensification | 0.606 | 0.877 | 10 March 2020 | Fixed | - |
| 2 | Intensification | 0.120 | 5.41 | 15 March 2020 | Fixed | - |
| 3 | Intensification | 0.0904 | 1.15 | 22 March 2020 | Fixed | - |
| 4 | Relaxation | 0.103 | 1.98 | 2 April 2020 | Estimated | 0.541 |
| 5 | Intensification | 0.0907 | 3.12 | 14 April 2020 | Estimated | 0.237 |
| 6 | Relaxation | 0.302 | 0.965 | 30 April 2020 | Fixed | - |
| 7 | Intensification | 0.0606 | 6.08 | 7 May 2020 | Fixed | - |
| 8 | Intensification | 0.0385 | 4.21 | 14 May 2020 | Fixed | - |
| 9 | Relaxation | 0.0601 | 0.199 | 19 May 2020 | Estimated | 0.487 |
| 10 | Relaxation | 0.817 | 0.505 | 4 June 2020 | Estimated | 0.603 |
| 11 | Intensification | 0.0344 | 4.18 | 11 June 2020 | Estimated | 0.456 |
| 12 | Relaxation | 0.219 | 3.23 | 30 June 2020 | Estimated | 0.298 |
| 13 | Intensification | 0.149 | 1.13 | 16 August 2020 | Estimated | 0.312 |
| 14 | Relaxation | 0.213 | 2.29 | 26 August 2020 | Estimated | 0.578 |
| 15 | Relaxation | 0.297 | 0.78 | 4 October 2020 | Estimated | 0.209 |
| 16 | Intensification | 0.185 | 1.26 | 21 October 2020 | Estimated | 0.352 |
| 17 | Intensification | 0.152 | 5.93 | 30 October 2020 | Fixed | - |
| 18 | Relaxation | 0.201 | 0.826 | 11 November 2020 | Estimated | 1.21 |
| 19 | Relaxation | 0.207 | 0.652 | 19 November 2020 | Estimated | 0.318 |
| 20 | Intensification | 0.201 | 2.13 | 22 November 2020 | Estimated | 0.554 |
| 21 | Intensification | 0.0672 | 1.87 | 10 December 2020 | Fixed | - |
| 22 | Relaxation | 0.228 | 1.36 | 18 December 2020 | Estimated | 0.426 |
| 23 | Intensification | 0.0937 | 5.09 | 1 January 2021 | Estimated | 0.141 |
| 24 | Relaxation | 0.120 | 9.78 | 14 January 2021 | Estimated | - |
| 25 | Relaxation | 0.229 | 10.1 | 5 February 2021 | Estimated | - |
| 26 | Intensification | 0.150 | 11.5 | 15 February 2021 | Estimated | - |
| 27 | Relaxation | 0.199 | 0.95 | 26 February 2021 | Estimated | - |
| 28 | Relaxation | 0.210 | 25.7 | 18 March 2021 | Fixed | - |
Table A7. Step functions of pcrit and pdeath for Saxony. We present estimates for the steps of the functions pcrit and pdeath at the specified dates and respective standard errors for Saxony. We also provide the standard error of the estimated time point (last column).
| Parameter | Description | Estimate | Relative Standard Error, % | Date (Respective Controls) | Standard Error (Days) |
|---|---|---|---|---|---|
| $\alpha_{crit,1}$ | Relative values of $p_{crit}$ starting at the respective date | 2.15 | 0.98 | 24 March 2020 | 0.34 |
| $\alpha_{crit,2}$ | | 1.99 | 4.22 | 10 April 2020 | 0.672 |
| $\alpha_{crit,3}$ | | 1.01 | 3.54 | 11 May 2020 | 1.25 |
| $\alpha_{crit,4}$ | | 2.54 | 2.49 | 5 June 2020 | 3.73 |
| $\alpha_{crit,5}$ | | 1.50 | 1.26 | 2 July 2020 | 4.36 |
| $\alpha_{crit,6}$ | | 1.19 | 3.41 | 27 July 2020 | 0.75 |
| $\alpha_{crit,7}$ | | 0.764 | 0.478 | 29 August 2020 | 4.93 |
| $\alpha_{crit,8}$ | | 0.398 | 5.12 | 18 September 2020 | 2.96 |
| $\alpha_{crit,9}$ | | 0.300 | 2.09 | 25 September 2020 | 2.12 |
| $\alpha_{crit,10}$ | | 0.528 | 2.16 | 13 October 2020 | 0.49 |
| $\alpha_{crit,11}$ | | 0.908 | 3.72 | 26 October 2020 | 1.15 |
| $\alpha_{crit,12}$ | | 0.999 | 2.43 | 1 December 2020 | 5.31 |
| $\alpha_{crit,13}$ | | 1.76 | 1.91 | 26 December 2020 | 2.06 |
| $\alpha_{crit,14}$ | | 2.01 | 1.56 | 10 January 2021 | 0.67 |
| $\alpha_{crit,15}$ | | 2.99 | 0.98 | 25 January 2021 | 2.15 |
| $\alpha_{crit,16}$ | | 2.68 | 5.77 | 13 February 2021 | 4.12 |
| $\alpha_{crit,17}$ | | 1.11 | 1.33 | 5 March 2021 | 5.11 |
| $\alpha_{crit,18}$ | | 0.700 | 79.2 | 4 March 2021 | - |
| $\alpha_{death,1}$ | Relative values of $p_{death}$ starting at the respective date | 0.655 | 2.26 | 4 April 2020 | 0.241 |
| $\alpha_{death,2}$ | | 3.58 | 6.81 | 24 April 2020 | 0.335 |
| $\alpha_{death,3}$ | | 1.94 | 5.32 | 17 May 2020 | 1.01 |
| $\alpha_{death,4}$ | | 0.743 | 1.07 | 8 June 2020 | 0.619 |
| $\alpha_{death,5}$ | | 0.296 | 4.32 | 7 July 2020 | 5.60 |
| $\alpha_{death,6}$ | | 0.401 | 1.56 | 4 August 2020 | 6.13 |
| $\alpha_{death,7}$ | | 0.142 | 6.77 | 26 August 2020 | 4.43 |
| $\alpha_{death,8}$ | | 0.473 | 9.05 | 27 September 2020 | 0.95 |
| $\alpha_{death,9}$ | | 0.314 | 1.42 | 3 October 2020 | 1.27 |
| $\alpha_{death,10}$ | | 0.638 | 0.84 | 2 November 2020 | 3.62 |
| $\alpha_{death,11}$ | | 1.41 | 0.91 | 6 November 2020 | 1.19 |
| $\alpha_{death,12}$ | | 1.64 | 2.31 | 1 December 2020 | 1.63 |
| $\alpha_{death,13}$ | | 2.54 | 1.555 | 20 December 2020 | 0.903 |
| $\alpha_{death,14}$ | | 2.66 | 3.89 | 8 January 2021 | 5.52 |
| $\alpha_{death,15}$ | | 3.48 | 5.53 | 19 January 2021 | 1.08 |
| $\alpha_{death,16}$ | | 2.31 | 4.90 | 9 February 2021 | 0.383 |
| $\alpha_{death,17}$ | | 1.22 | 2.76 | 26 February 2021 | 2.06 |
| $\alpha_{death,18}$ | | 0.807 | 4.03 | 7 March 2021 | 5.34 |
| $\alpha_{death,19}$ | | 1.09 | 70.1 | 11 March 2021 | - |
Table A8. Parameter estimates and comparison with average priors for the parameter settings for Saxony. We present estimated parameters of the SECIR model and initial conditions of control parameters and their respective standard errors for the parametrization of the epidemic in Saxony. We also perform a formal comparison of estimates and expected priors using a t-test.
| Parameter | Description | Posterior Estimate | Relative Standard Error, % | Prior Value | p-Value |
|---|---|---|---|---|---|
| influx | Initial influx of infections into compartment E until first interventions | 68.1 | 6.17 | - | - |
| $r_1$ | Infection rate through asymptomatic subjects | 1.61 | 1.32 | - | - |
| $r_3$ | Transit rate for compartment E (latent time) | 0.270 | 0.234 | 1/3 | 0.221 |
| $r_4$ | Transit rate for asymptomatic sub-compartments | 0.697 | 0.691 | 3/5 | 0.357 |
| $r_{4,b}$ | Rate of development of symptoms after infection | 0.294 | 3.27 | 1/2.5 | 0.489 |
| $r_5$ | Transit rate for symptomatic sub-compartments | 1.11 | 2.13 | 3/2.5 | 0.236 |
| $r_6$ | Rate of development of critical state after being symptomatic | 0.170 | 1.46 | 1/5 | 0.495 |
| $r_7$ | Transit rate for critical state sub-compartment | 0.198 | 0.659 | 3/17 | 0.372 |
| $r_8$ | Death rate of patients in critical sub-compartment 1 | 0.140 | 1.33 | 1/8 | 0.393 |
| $r_{b1,2}$ | Proportionality coefficient of intensifications/relaxations between $b_1$ and $b_2$ | 0.248 | 15.5 | - | - |
| $P_{S,M}$ | Fraction of reported cases | 0.509 | 5.37 | 1/2 | |
| $p_{crit}$ ($p_{crit,0}$) | Probability of becoming critical after developing symptoms (initial value) | 0.0794 | 1.76 | - | - |
| $p_{death}$ ($p_{death,0}$) | Probability of death after becoming critical (initial value) | 0.137 | 0.957 | - | - |
| $p_{death,S,0}$ | Proportionality coefficient for evaluating probability of death after developing symptoms without becoming critical, see (A6) | 0.719 | 7.3 | - | - |

References

  1. Adiga, A.; Dubhashi, D.; Lewis, B.; Marathe, M.; Venkatramanan, S.; Vullikanti, A. Mathematical Models for COVID-19 Pandemic: A Comparative Analysis. J. Indian Inst. Sci. 2020, 100, 793–807.
  2. Tang, J.; Vinayavekhin, S.; Weeramongkolkul, M.; Suksanon, C.; Pattarapremcharoen, K.; Thiwathittayanuphap, S.; Leelawat, N. Agent-Based Simulation and Modeling of COVID-19 Pandemic: A Bibliometric Analysis. J. Disaster Res. 2022, 17, 93–102.
  3. Kucharski, A.J.; Klepac, P.; Conlan, A.J.K.; Kissler, S.M.; Tang, M.L.; Fry, H.; Gog, J.R.; Edmunds, W.J.; Emery, J.C.; Medley, G.; et al. Effectiveness of isolation, testing, contact tracing, and physical distancing on reducing transmission of SARS-CoV-2 in different settings: A mathematical modelling study. Lancet Infect. Dis. 2020, 20, 1151–1160.
  4. Quilty, B.J.; Clifford, S.; Hellewell, J.; Russell, T.W.; Kucharski, A.J.; Flasche, S.; Edmunds, W.J.; Atkins, K.E.; Foss, A.M.; Waterlow, N.R.; et al. Quarantine and testing strategies in contact tracing for SARS-CoV-2: A modelling study. Lancet Public Health 2021, 6, e175–e183.
  5. Rahimi, I.; Chen, F.; Gandomi, A.H. A review on COVID-19 forecasting models. Neural Comput. Appl. 2021.
  6. Flaxman, S.; Mishra, S.; Gandy, A.; Unwin, H.J.T.; Mellan, T.A.; Coupland, H.; Whittaker, C.; Zhu, H.; Berah, T.; Eaton, J.W.; et al. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature 2020, 584, 257–261.
  7. Bo, Y.; Guo, C.; Lin, C.; Zeng, Y.; Li, H.B.; Zhang, Y.; Hossain, S.; Chan, J.W.; Yeung, D.W.; Kwok, K.O.; et al. Effectiveness of non-pharmaceutical interventions on COVID-19 transmission in 190 countries from 23 January to 13 April 2020. Int. J. Infect. Dis. 2020, 102, 247–253.
  8. Khailaie, S.; Mitra, T.; Bandyophadhyay, A.; Schips, M.; Mascheroni, P.; Vanella, P.; Lange, B.; Binder, S.C.; Meyer-Hermann, M. Development of the reproduction number from coronavirus SARS-CoV-2 case data in Germany and implications for political measures. BMC Med. 2020, 19, 32.
  9. Barbarossa, M.V.; Fuhrmann, J.; Meinke, J.H.; Krieg, S.; Varma, H.V.; Castelletti, N.; Lippert, T. Modeling the spread of COVID-19 in Germany: Early assessment and possible scenarios. PLoS ONE 2020, 15, e0238559.
  10. an der Heiden, M.; Buchholz, U. Modellierung von Beispielszenarien der SARS-CoV-2-Epidemie 2020 in Deutschland; Robert Koch-Institut: Berlin, Germany, 2020.
  11. Dehning, J.; Zierenberg, J.; Spitzner, F.P.; Wibral, M.; Neto, J.P.; Wilczek, M.; Priesemann, V. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science 2020, 369, eabb9789.
  12. Harris, J.E. Overcoming Reporting Delays Is Critical to Timely Epidemic Monitoring: The Case of COVID-19 in New York City. medRxiv 2020.
  13. Böttcher, S.; Oh, D.-Y.; Staat, D.; Stern, D.; Albrecht, S.; Wilrich, N.; Zacher, B.; Mielke, M.; Rexroth, U.; Hamouda, O. Erfassung der SARS-CoV-2-Testzahlen in Deutschland (Stand 2.12.2020); Robert Koch-Institut: Berlin, Germany, 2020.
  14. McCulloh, I.; Kiernan, K.; Kent, T. Improved Estimation of Daily COVID-19 Rate from Incomplete Data. In Proceedings of the 2020 Fourth International Conference on Multimedia Computing, Networking and Applications (MCNA), Valencia, Spain, 19–22 October 2020; pp. 153–158.
  15. Ram, V.; Schaposnik, L.P. A modified age-structured SIR model for COVID-19 type viruses. Sci. Rep. 2021, 11, 15194.
  16. Cooper, I.; Mondal, A.; Antonopoulos, C.G. A SIR model assumption for the spread of COVID-19 in different communities. Chaos Solitons Fractals 2020, 139, 110057.
  17. Georgatzis, K.; Williams, C.K.I.; Hawthorne, C. Input-Output Non-Linear Dynamical Systems applied to Physiological Condition Monitoring. In Proceedings of the 1st Machine Learning for Healthcare Conference 2016: PMLR, Los Angeles, CA, USA, 19–20 August 2016.
  18. Nishiura, H.; Linton, N.M.; Akhmetzhanov, A.R. Serial interval of novel coronavirus (COVID-19) infections. Int. J. Infect. Dis. 2020, 93, 284–286.
  19. Tindale, L.C.; Stockdale, J.E.; Coombe, M.; Garlock, E.S.; Lau, W.Y.V.; Saraswat, M.; Zhang, L.; Chen, D.; Wallinga, J.; Colijn, C. Evidence for transmission of COVID-19 prior to symptom onset. eLife 2020, 9, e57149.
  20. Böhmer, M.M.; Buchholz, U.; Corman, V.M.; Hoch, M.; Katz, K.; Marosevic, D.V.; Böhm, S.; Woudenberg, T.; Ackermann, N.; Konrad, R.; et al. Investigation of a COVID-19 outbreak in Germany resulting from a single travel-associated primary case: A case series. Lancet Infect. Dis. 2020, 20, 920–928.
  21. Ganyani, T.; Kremer, C.; Chen, D.; Torneri, A.; Faes, C.; Wallinga, J.; Hens, N. Estimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data, March 2020. Eurosurveillance 2020, 25, 2000257.
  22. Wölfel, R.; Corman, V.M.; Guggemos, W.; Seilmaier, M.; Zange, S.; Müller, M.A.; Niemeyer, D.; Jones, T.C.; Vollmar, P.; Rothe, C.; et al. Virological assessment of hospitalized patients with COVID-2019. Nature 2020, 581, 465–469.
  23. Hu, Z.; Song, C.; Xu, C.; Jin, G.; Chen, Y.; Xu, X.; Ma, H.; Chen, W.; Lin, Y.; Zheng, Y.; et al. Clinical characteristics of 24 asymptomatic infections with COVID-19 screened among close contacts in Nanjing, China. Sci. China Life Sci. 2020, 63, 706–711.
  24. Li, R.; Pei, S.; Chen, B.; Song, Y.; Zhang, T.; Yang, W.; Shaman, J. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science 2020, 368, 489–493.
  25. Stebegg, M.; Kumar, S.D.; Silva-Cayetano, A.; Fonseca, V.R.; Linterman, M.A.; Graca, L. Regulation of the Germinal Center Response. Front. Immunol. 2018, 9, 2469.
  26. Hao, X.; Cheng, S.; Wu, D.; Wu, T.; Lin, X.; Wang, C. Reconstruction of the full transmission dynamics of COVID-19 in Wuhan. Nature 2020, 584, 420–424.
  27. He, X.; Lau, E.H.Y.; Wu, P.; Deng, X.; Wang, J.; Hao, X.; Lau, Y.C.; Wong, J.Y.; Guan, Y.; Tan, X.; et al. Temporal dynamics in viral shedding and transmissibility of COVID-19. Nat. Med. 2020, 26, 672–675.
  28. Epidemiologischer Steckbrief zu SARS-CoV-2 und COVID-19. RKI 2021. Available online: https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Steckbrief.html;jses- (accessed on 26 November 2021).
  29. Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X.; et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study. Lancet 2020, 395, 1054–1062.
  30. Sanche, S.; Lin, Y.T.; Xu, C.; Romero-Severson, E.; Hengartner, N.; Ke, R. High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerg. Infect. Dis. 2020, 26, 1470–1477.
  31. COVID-19 National Emergency Response Center. Coronavirus Disease-19: The First 7755 Cases in the Republic of Korea. Osong Public Health Res. Perspect. 2020, 11, 85–90.
  32. Schuppert, A.; Theisen, S.; Fränkel, P.; Weber-Carstens, S.; Karagiannidis, C. Bundesweites Belastungsmodell für Intensivstationen durch COVID-19. Med. Klin.-Intensivmed. Und Notf. 2022, 117, 218–226.
  33. Tolksdorf, K.; Buda, S.; Schuler, E.; Wieler, L.H.; Haas, W. Eine höhere Letalität und lange Beatmungsdauer unterscheiden COVID-19 von schwer verlaufenden Atemwegsinfektionen in Grippewellen. Epidemiol. Bull. 2020, 41.
  34. Karagiannidis, C.; Mostert, C.; Hentschker, C.; Voshaar, T.; Malzahn, J.; Schillinger, G.; Klauber, J.; Janssens, U.; Marx, G.; Weber-Carstens, S.; et al. Case characteristics, resource use, and outcomes of 10,021 patients with COVID-19 admitted to 920 German hospitals: An observational study. Lancet Respir. Med. 2020, 8, 853–862.
  35. Verity, R.; Okell, L.C.; Dorigatti, I.; Winskill, P.; Whittaker, C.; Imai, N.; Cuomo-Dannenburg, G.; Thompson, H.; Walker, P.G.T.; Fu, H.; et al. Estimates of the severity of coronavirus disease 2019: A model-based analysis. Lancet Infect. Dis. 2020, 20, 669–677.
  36. Linton, N.; Kobayashi, T.; Yang, Y.; Hayashi, K.; Akhmetzhanov, A.; Jung, S.-M.; Yuan, B.; Kinoshita, R.; Nishiura, H. Incubation Period and Other Epidemiological Characteristics of 2019 Novel Coronavirus Infections with Right Truncation: A Statistical Analysis of Publicly Available Case Data. J. Clin. Med. 2020, 9, 538.
  37. Byambasuren, O.; Cardona, M.; Bell, K.; Clark, J.; McLaws, M.-L.; Glasziou, P. Estimating the extent of asymptomatic COVID-19 and its potential for community transmission: Systematic review and meta-analysis. J. Assoc. Med. Microbiol. Infect. Dis. 2020, 5, 223–234.
  38. Oran, D.P.; Topol, E.J. Prevalence of Asymptomatic SARS-CoV-2 Infection: A Narrative Review. Ann. Intern. Med. 2020, 173, 362–367.
  39. Buitrago-Garcia, D.; Egli-Gany, D.; Counotte, M.J.; Hossmann, S.; Imeri, H.; Ipekci, A.M.; Salanti, G.; Low, N. Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis. PLoS Med. 2020, 17, e1003346.
  40. Neuhauser, H.; Thamm, R.; Buttmann-Schweiger, N.; Fiebig, J.; Offergeld, R.; Poethko-Müller, C.; Prütz, F.; Santos-Hövener, C.; Sarganas, G.; Angelika, S.R.; et al. Ergebnisse seroepidemiologischer Studien zu SARS-CoV-2 in Stichproben der Allgemeinbevölkerung und bei Blutspenderinnen und Blutspendern in Deutschland (Stand 03.12.2020). Epidemiol. Bull. 2020, 50.
  41. Gornyk, D.; Harries, M.; Glöckner, S.; Strengert, M.; Kerrinnes, T.; Bojara, G.; Krause, G. SARS-CoV-2 seroprevalence in Germany—A population based sequential study in five regions. medRxiv 2021.
  42. Bock, W.; Jayathunga, Y.; Götz, T.; Rockenfeller, R. Are the upper bounds for new SARS-CoV-2 infections in Germany useful. Comput. Math. Biophys. 2021, 9, 242–260.
  43. COVID-19-Fälle nach Meldewoche und Geschlecht sowie Anteile mit für COVID-19 relevanten Symptomen, Anteile Hospitalisierter/Verstorbener und Altersmittelwert/-median (Tabelle wird jeden Donnerstag aktualisiert). RKI 2020. Available online: https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Daten/Klinische_Aspekte.html (accessed on 26 June 2022).
  44. Tagesreport-Archiv. DIVI 2020. Available online: https://www.divi.de/divi-intensivregister-tagesreport- (accessed on 26 June 2022).
  45. Delgado, G.; Safranek, J.; Goyette, B.; Spady, R. Reported versus Actual Date of Death. “Reported” versus “Actual”: Two Different Things. Available online: https://covidplanningtools.com/reported-versus-actual-date-of-death/ (accessed on 26 June 2022).
  46. Robert Koch Institute. Bericht zu Virusvarianten von SARS-CoV-2 in Deutschland, insbesondere zur Variant of Concern (VOC) B.1.1.7. Available online: https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/DESH/Bericht_VOC_2021-03-03.pdf?__blob=publicationFile (accessed on 3 March 2021).
  47. Kheifetz, Y.; Scholz, M. Modeling individual time courses of thrombopoiesis during multi-cyclic chemotherapy. PLoS Comput. Biol. 2019, 15, e1006775.
  48. Kuhn, E.; Lavielle, M. Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM Probab. Stat. 2004, 8, 115–131.
  49. Meineke, F.A.; Löbe, M.; Stäubert, S. Introducing Technical Aspects of Research Data Management in the Leipzig Health Atlas. Stud. Health Technol. Inform. 2018, 247, 426–430.
  50. Hale, T.; Angrist, N.; Goldszmidt, R.; Kira, B.; Petherick, A.; Phillips, T.; Webster, S.; Cameron-Blake, E.; Hallas, L.; Majumdar, S.; et al. A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker). Nat. Hum. Behav. 2021, 5, 529–538.
  51. COVID-19 Government Response Tracker; University of Oxford: Oxford, UK, 2020. Available online: https://www.bsg.ox.ac.uk/research/research-projects/covid-19-government-response-tracker (accessed on 3 March 2022).
  52. Mullen, J.L.; Tsueng, G.; Latif, A.A.; Alkuzweny, M.; Cano, M.; Haag, E.; Zhou, J.; Zeller, M.; Hufbauer, E.; Matteson, N. Outbreak.info. A Standardized, Open-Source Database of COVID-19 Resources and Epidemiology Data. Available online: https://outbreak.info (accessed on 18 July 2021).
  53. Aktuelle Entwicklung der COVID-19 Epidemie in Leipzig und Sachsen. Bulletin 14 vom 20.02.2021. Available online: https://www.imise.uni-leipzig.de/sites/www.imise.uni-leipzig.de/files/files/uploads/Medien/bulletin_n14_covid19_sachsen__2021_02_22_v11.pdf (accessed on 26 June 2022).
  54. Scholz, S.; Waize, M.; Weidemann, F.; Treskova-Schwarzbach, M.; Haas, L.; Harder, T.; Karch, A.; Lange, B.; Kuhlmann, A.; Jäger, V.; et al. Einfluss von Impfungen und Kontaktreduktionen auf die dritte Welle der SARS-CoV-2-Pandemie und perspektivische Rückkehr zu prä-pandemischem Kontaktverhalten. 2021. Available online: https://edoc.rki.de/handle/176904/8023?show=full (accessed on 26 June 2022).
  55. Friberg, L.E.; Henningsson, A.; Maas, H.; Nguyen, L.; Karlsson, M.O. Model of chemotherapy-induced myelosuppression with parameter consistency across drugs. J. Clin. Oncol. 2002, 20, 4713–4721.
  56. Maire, F.; Friel, N.; Mira, A.; Raftery, A.E. Adaptive Incremental Mixture Markov Chain Monte Carlo. J. Comput. Graph. Stat. 2019, 28, 790–805.
  57. Bracher, J.; Wolffram, D.; Deuschel, J.; Görgen, K.; Ketterer, J.L.; Ullrich, A.; Abbott, S.; Barbarossa, M.V.; Bertsimas, D.; Bhatia, S.; et al. A pre-registered short-term forecasting study of COVID-19 in Germany and Poland during the second wave. Nat. Commun. 2021, 12, 5173.
  58. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data Via the EM Algorithm. J. R. Stat. Soc. Ser. B Methodol. 1977, 39, 1–22.
  59. Tierney, L. Markov Chains for Exploring Posterior Distributions. Ann. Stat. 1994, 22, 1701–1728.
  60. Haario, H.; Saksman, E.; Tamminen, J. An Adaptive Metropolis Algorithm. Bernoulli 2001, 7, 223.
  61. Geweke, J.F. Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments. In Bayesian Statistics; Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M., Eds.; Clarendon Press: Oxford, UK, 1992; p. 4.
  62. Wu, Z.; McGoogan, J.M. Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention. JAMA 2020, 323, 1239–1242.
Figure 1. General scheme of our IO-NLDS model. The epidemiologic SECIR model is integrated as a hidden layer. Respective equations are provided in Appendix A. The input layer consists of external modifiers including parameter changes due to changes in testing policy, non-pharmaceutical interventions, and age-structures. The output layer is derived from respective hidden layers via stochastic relationships (see later). The output layer is compared with real-world data. The superscript Mu denotes new virus variants.
Figure 4. Comparison of prior and posterior values of estimated parameters of the SECIR model. We present prior vs. posterior distributions of estimated parameters of the SECIR model. Ranges for priors represent assumed minimum and maximum values. Ranges for posteriors represent 95%-confidence intervals. Numbers are provided in Table A8 from Appendix J.
Figure 5. Relationship between the estimated step function of infectivity of asymptomatic subjects and the Federal Government stringency index (GSI). The GSI [50] is a composite measure based on nine response indicators including school closures, workplace closures, and travel bans, rescaled to a value from 0 to 100 (100 = strictest). If policies vary at the level of federal states, the index is shown for the state with the strictest measures. For background information, see also [51]. Colors of curves correspond to different y-axes.
Figure 6. Comparison of predicted and observed decline of the second wave in Saxony following the initiated lock-down. Our model was fitted to the data observed until 21 December 2020 (reported test-positives; grey curve = raw data, black curve = smoothed). The estimated step functions b1 and b2, describing the infectivity of asymptomatic and symptomatic subjects, were reduced by 0% (yellow: no lock-down = control scenario), 20% (green: pessimistic scenario), 40% (blue: realistic scenario), and 60% (magenta: optimistic scenario) to simulate four scenarios of the future course of the epidemic under lock-down conditions. The numbers of test-positives observed after 21 December 2020 (light red = raw data, dark red = smoothed) closely followed the expected scenario of 40% lock-down efficacy. Shaded areas represent 95% prediction intervals. The predictions and parameters were reported in our regular bulletin deposited at Leipzig Health Atlas, ID: 85AH9JMUFM-4.
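The scenario construction described in the caption of Figure 6 amounts to scaling the last fitted values of b1 and b2 by a common factor. The following minimal sketch illustrates this; the function name, the scenario dictionary, and the numerical values of b1 and b2 are hypothetical placeholders and are not taken from the fitted model.

```python
def apply_lockdown(b1, b2, reduction):
    """Scale both infectivity factors by (1 - reduction), with reduction in [0, 1]."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must lie in [0, 1]")
    factor = 1.0 - reduction
    return b1 * factor, b2 * factor


# Reductions of the four scenarios named in the caption of Figure 6
SCENARIOS = {"control": 0.0, "pessimistic": 0.2, "realistic": 0.4, "optimistic": 0.6}

# Hypothetical last fitted values of the step functions b1 and b2
b1_last, b2_last = 1.0, 1.0

scaled = {name: apply_lockdown(b1_last, b2_last, red) for name, red in SCENARIOS.items()}
for name, (b1, b2) in scaled.items():
    print(f"{name}: b1 = {b1:.2f}, b2 = {b2:.2f}")
```

Each scaled pair would then be used as the infectivity from the start of the lock-down onwards when simulating the hidden SECIR layer forward in time.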
Figure 7. Simulation of third-wave scenarios for Saxony/Germany. Upper row: The model was fitted to all data observed until 14 February 2021 (grey curve = raw data of reported test-positives, black curve = smoothed). Three scenarios were simulated, differing in the assumed initial proportion of B.1.1.7, which was not exactly known at that time point (10%, 20%, and 30%, respectively), and in the assumed increase in infectivity of B.1.1.7 (parameter mur = 1.7, 1.8, and 2, respectively). Predicted courses of subjects infected with the respective variants are shown as shaded areas. The observed total numbers of test-positives (light red = raw data of reported test-positives, dark red = smoothed curve) closely followed the pessimistic scenario 3. Lower row: Comparison with the proportion of B.1.1.7 retrieved from [52] on 18 July 2021 shows that the initial proportion of B.1.1.7 was indeed close to that assumed for scenario 3. Blue curves represent 95% confidence intervals of the B.1.1.7 proportion predicted for the different scenarios. All predictions were reported in our bulletin of 20 February 2021, deposited at our Leipzig Health Atlas [53].
Table 1. Description of model compartments. We describe the compartments of the model and their biological meaning. Compartments $E$, $I_A$, and $I_S$ are duplicated to account for two concurrent virus variants.
Compartment Name | Sub-Compartments | Description
$Sc$ | | Susceptible
$E$ | | Latent stage (not infectious)
$I_A$ | $I_{A,1}$ | Asymptomatic infected state 1; either develops symptoms, i.e., transits to $I_{S,1}$ with probability $p_{symp}$ and rate $r_{4,b}$, or stays asymptomatic with probability $1 - p_{symp}$ and transits to $I_{A,2}$ with rate $r_4$
 | $I_{A,2}$ | Asymptomatic infected state 2; transits to $I_{A,3}$ with rate $r_4$
 | $I_{A,3}$ | Asymptomatic infected state 3; transits to $R$ with rate $r_4$
$I_S$ | $I_{S,1}$ | Symptomatic infected state 1; either becomes critical, i.e., transits to $C_1$ with probability $p_{crit}$ and rate $r_6$, or stays sub-critical with probability $1 - p_{crit}$ and transits to $I_{S,2}$ with rate $r_5$
 | $I_{S,2}$ | Symptomatic infected state 2; either dies, i.e., transits to $D$ with probability $p_{death,S}$, or transits to $I_{S,3}$ with probability $1 - p_{death,S}$ and rate $r_5$
 | $I_{S,3}$ | Symptomatic infected state 3; transits to $R$ with rate $r_5$
$C$ | $C_1$ | Critical state 1, not infectious; either dies, i.e., transits to $D$ with probability $p_{death}$ and transit rate $r_8$, or stays critical with probability $1 - p_{death}$ and transits to $C_2$ with rate $r_7$
 | $C_2$ | Critical state 2; transits to $C_3$ with rate $r_7$
 | $C_3$ | Critical state 3; transits to $R$ with rate $r_7$
$R$ | | Recovered (absorbing state)
$D$ | | Dead (absorbing state)
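The transitions listed in Table 1 can be read as a system of ordinary differential equations. The sketch below is one plausible single-variant reading and is not necessarily identical to Equations (A1) and (A2) of Appendix A. It assumes mass-action infection with all sub-compartments of I_A and I_S being infectious, that E feeds into I_A,1, and that deaths from I_S,2 occur at rate r5. It ignores the initial influx, the second variant, and all time-dependent inputs. Function and variable names are illustrative; the parameter values are the estimates listed in Table 2.

```python
# Values taken from the "Value" column of Table 2 (rates in 1/day)
PARAMS = {
    "r1": 1.19, "r2": 0.451, "r3": 0.272, "r4": 0.636, "r4b": 0.456,
    "r5": 0.946, "r6": 0.186, "r7": 0.159, "r8": 0.104,
    "p_symp": 0.5, "p_crit": 0.0765, "p_death": 0.119,
    "p_death_S": 0.587 * 0.119,  # p_death,S = p_death,S,0 * p_death
}


def secir_rhs(state, p, n_total):
    """Time derivatives for the compartment chain of Table 1 (single variant)."""
    S, E, IA1, IA2, IA3, IS1, IS2, IS3, C1, C2, C3, R, D = state
    # Assumption: mass-action infection, all I_A and I_S sub-compartments infectious
    new_inf = (p["r1"] * (IA1 + IA2 + IA3) + p["r2"] * (IS1 + IS2 + IS3)) * S / n_total
    ia1_to_is1 = p["p_symp"] * p["r4b"] * IA1
    ia1_to_ia2 = (1 - p["p_symp"]) * p["r4"] * IA1
    is1_to_c1 = p["p_crit"] * p["r6"] * IS1
    is1_to_is2 = (1 - p["p_crit"]) * p["r5"] * IS1
    is2_to_d = p["p_death_S"] * p["r5"] * IS2  # assumption: death from I_S,2 at rate r5
    is2_to_is3 = (1 - p["p_death_S"]) * p["r5"] * IS2
    c1_to_d = p["p_death"] * p["r8"] * C1
    c1_to_c2 = (1 - p["p_death"]) * p["r7"] * C1
    return [
        -new_inf,                                       # S
        new_inf - p["r3"] * E,                          # E
        p["r3"] * E - ia1_to_is1 - ia1_to_ia2,          # I_A,1
        ia1_to_ia2 - p["r4"] * IA2,                     # I_A,2
        p["r4"] * IA2 - p["r4"] * IA3,                  # I_A,3
        ia1_to_is1 - is1_to_c1 - is1_to_is2,            # I_S,1
        is1_to_is2 - is2_to_d - is2_to_is3,             # I_S,2
        is2_to_is3 - p["r5"] * IS3,                     # I_S,3
        is1_to_c1 - c1_to_d - c1_to_c2,                 # C_1
        c1_to_c2 - p["r7"] * C2,                        # C_2
        p["r7"] * C2 - p["r7"] * C3,                    # C_3
        p["r4"] * IA3 + p["r5"] * IS3 + p["r7"] * C3,   # R
        is2_to_d + c1_to_d,                             # D
    ]


def euler_step(state, p, n_total, dt):
    """One explicit Euler step; adequate for illustration with a small dt."""
    return [x + dt * dx for x, dx in zip(state, secir_rhs(state, p, n_total))]
```

Because every outflow of one compartment appears as an inflow of another, the derivatives sum to zero and the total population is conserved, which provides a quick consistency check for any implementation of the table.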
Table 2. Basic model parameters. We present prior values and ranges derived from the literature as well as estimated values derived from parameter fitting. A transit rate is the reciprocal of the transit time of the respective compartment. Posteriors can be found in Figure 4. §: Further details and definitions of the parameters are given in Appendix A, Equations (A1) and (A2); a justification of the priors is provided in Appendix H.
Parameter | Unit | Description | Source | Reference | Value | Prior Mean | Min | Max
influx | Subjects per day | Initial influx of infections into compartment E until first interventions | Estimated | § | 3171 | - | - | -
$r_1$ | day⁻¹ | Infection rate through asymptomatic subjects | Estimated | § | 1.19 | - | - | -
$r_2$ | day⁻¹ | Infection rate through symptomatic subjects | Set equal to $r_{b1,2} \cdot r_1$ (parsimony) | § | 0.451 | - | - | -
$r_{b1,2}$ | - | Ratio of infection rates of symptomatic to asymptomatic subjects, $r_2/r_1$ | Estimated | § | 0.379 | - | 0 | -
$r_3$ | day⁻¹ | Transit rate for compartment E (latent time) | Prior constraint | §, [10,18,19,20,21] | 0.272 | 1/3 | 1/4 | 1/2
$r_4$ | day⁻¹ | Transit rate for asymptomatic sub-compartments | Prior constraint | §, [22,23,24,25] | 0.636 | 3/5 | 3/10 | 3/4
$r_{4,b}$ | day⁻¹ | Rate of development of symptoms after infection | Prior constraint | §, [10,18,19,20,21,26,27,28] | 0.456 | 2/551/51 | |
$r_5$ | day⁻¹ | Transit rate for symptomatic sub-compartments | Prior constraint | § | 0.946 | 6/5 | 6/15 | 6/3
$r_6$ | day⁻¹ | Rate of development of critical state after being symptomatic | Prior constraint | §, [10,29,30,31] | 0.186 | 1/5 | 1/7 | 1/4
$r_7$ | day⁻¹ | Transit rate for critical state sub-compartment | Prior constraint | §, [10,32,33,34] | 0.159 | 3/17 | 3/35 | 3/8
$r_8$ | day⁻¹ | Death rate of patients in critical sub-compartment 1 | Prior constraint | §, [29,35,36] | 0.104 | 1/8 | 1/14 | 2/13
$p_{symp}$ | - | Probability of symptoms development after being infected | Set or prior constrained (overfitted if estimated unconstrained) | §, [37,38,39] | 0.5 | - | 0.3 | 0.8
$p_{crit}$ ($p_{crit,0}$) | - | Initial value $p_{crit,0}$ of step function $p_{crit}$, the probability of becoming critical after developing symptoms | Estimated | §, [9,27] | 0.0765 | - | 0 | 1
$p_{death}$ ($p_{death,0}$) | - | Initial value $p_{death,0}$ of step function $p_{death}$, the probability of dying after becoming critical | Estimated | §, [32] | 0.119 | - | 0 | 1
$p_{death,S}$ | - | Probability of death after developing symptoms without becoming critical | Set equal to $p_{death,S,0} \cdot p_{death}$ (parsimony) | §, [32] | - | - | 0 | 1
$p_{death,S,0}$ | | Proportionality factor for probability of death after developing symptoms without becoming critical | Estimated | § | 0.587 | | |
$P_{S,M}$ | - | Fraction of unreported cases | Prior constraint | §, [40,41] | 0.499 | 0.5 | 0.1 | 0.90
mur | | Ratio $r_1^{Mu}/r_1 = r_2^{Mu}/r_2$ reflecting higher infectivity of B.1.1.7 variant | Set | § | 1.65 | - | - | -
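Several entries of Table 2 are linked by simple arithmetic, which the following sketch makes explicit. It assumes that each of the compartments I_A, I_S, and C is traversed as a chain of three sub-compartments with a common rate (as in Table 1), so that a prior mean such as 3/5 per day corresponds to a mean total residence time of 5 days. The helper names are hypothetical and the interpretation is ours.

```python
from fractions import Fraction


def mean_chain_time(rate_per_day, n_sub):
    """Mean time (days) to pass n_sub sub-compartments, each left at rate_per_day."""
    return n_sub / rate_per_day


# Prior means of the transit rates from Table 2
days_asymptomatic = mean_chain_time(Fraction(3, 5), 3)   # r4: 5 days
days_symptomatic = mean_chain_time(Fraction(6, 5), 3)    # r5: 5/2 days
days_critical = mean_chain_time(Fraction(3, 17), 3)      # r7: 17 days

# r2 is not estimated separately: r2 = r_b1,2 * r1
r1, r_b12 = 1.19, 0.379
r2 = r_b12 * r1  # rounds to the tabulated 0.451


def variant_rates(r1, r2, mur):
    """Infection rates of the new variant: r1Mu = mur * r1 and r2Mu = mur * r2."""
    return mur * r1, mur * r2


r1_mu, r2_mu = variant_rates(r1, r2, 1.65)
print(days_asymptomatic, days_symptomatic, days_critical, round(r2, 3))
```

Writing the priors as fractions keeps the link to the underlying residence times visible, which is presumably why the table lists them in that form.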
Table 3. Parameters used to define the input layer. These parameters were used to empirically model changing NPIs or changing contact behavior, changes in testing policies and changing age-structures during the course of the epidemic. Respective input functions constitute the input layer of our IO-NLDS model.
Parameter | Unit | Description | Source | Remarks
$N_{tr}$ | - | Number of time points of changes of NPI/contact behavior | Empirically defined | 13 intensifications, 15 relaxations identified (determined by information criterion)
$b_{tr,j}$, $j = 1, \dots, N_{tr}$ | - | Relative infectivity of subjects in the time interval [tr, tr + 1] | Estimated | Assumed to be the same for symptomatic and asymptomatic patients
$Tr_j$, $j = 1, \dots, N_{tr}$ | Days | Time points of NPI/contact behavior changes | Estimated or fixed | Strictly monotone sequence
$N_{crit}$ | - | Number of time steps of $p_{crit}(t)$ | Empirically defined | 18 (determined by information criterion)
$\alpha_{crit,j}$, $j = 1, \dots, N_{crit}$ | - | Value of $p_{crit}$ between two time steps | Estimated | Within the interval [0, 1]
$Tp_{crit,j}$, $j = 1, \dots, N_{crit}$ | Days | Time points of changes of $p_{crit}$ | Estimated | Strictly monotone sequence
$N_{death}$ | - | Number of time steps of $p_{death}(t)$ | Empirically defined | 19 (determined by information criterion)
$\alpha_{death,j}$, $j = 1, \dots, N_{death}$ | - | Value of $p_{death}$ between two time steps | Estimated | Within the interval [0, 1]
$Tp_{death,j}$, $j = 1, \dots, N_{death}$ | Days | Time points of changes of $p_{death}(t)$ | Estimated | Strictly monotone sequence
$Del_{tr}$ | Days | Delay of activation of NPI | Fixed | 2 days
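The input functions parametrized in Table 3 are step functions: a strictly increasing sequence of change points, one value per interval, and, for NPIs, a fixed activation delay of 2 days. A minimal sketch of such a function is given below. The function name, its signature, and the example numbers are hypothetical; the smoothing or transition behavior used in the actual model is not reproduced here.

```python
from bisect import bisect_right


def step_function(t, change_points, values, initial, delay=0.0):
    """Piecewise-constant input: `initial` before the first (delayed) change point,
    then values[j] from change_points[j] + delay onwards."""
    if len(change_points) != len(values):
        raise ValueError("one value per change point is required")
    if any(a >= b for a, b in zip(change_points, change_points[1:])):
        raise ValueError("change points must be strictly increasing")
    shifted = [c + delay for c in change_points]
    idx = bisect_right(shifted, t)  # number of change points already active at time t
    return initial if idx == 0 else values[idx - 1]


# Hypothetical example: two NPI changes on days 10 and 30, activated 2 days later
change_days = [10.0, 30.0]
rel_infectivity = [0.5, 0.8]
for day in (5.0, 11.9, 12.0, 31.0, 32.0):
    print(day, step_function(day, change_days, rel_infectivity, initial=1.0, delay=2.0))
```

The same construction can serve for p_crit(t) and p_death(t) with delay 0, where the values are additionally constrained to the interval [0, 1].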
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Kheifetz, Y.; Kirsten, H.; Scholz, M. On the Parametrization of Epidemiologic Models—Lessons from Modelling COVID-19 Epidemic. Viruses 2022, 14, 1468. https://doi.org/10.3390/v14071468
