Article

Expert System to Model and Forecast Time Series of Epidemiological Counts with Applications to COVID-19

by Beatriz González-Pérez 1,2,*, Concepción Núñez 3, José L. Sánchez 1, Gabriel Valverde 1 and José Manuel Velasco 4

1 Department of Statistics and Operations Research, Complutense University of Madrid (UCM), 28040 Madrid, Spain
2 Interdisciplinary Mathematics Institute (IMI), Complutense University of Madrid (UCM), 28040 Madrid, Spain
3 Laboratory of Research in Genetics of Complex Diseases, Hospital Clínico San Carlos, IdISSC, 28040 Madrid, Spain
4 Computer Architecture and Automation Department, Complutense University of Madrid (UCM), 28040 Madrid, Spain
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(13), 1485; https://doi.org/10.3390/math9131485
Submission received: 28 May 2021 / Revised: 17 June 2021 / Accepted: 21 June 2021 / Published: 24 June 2021
(This article belongs to the Special Issue Mathematical Modeling and Its Application in Medicine)

Abstract

We developed two models for real-time monitoring and forecasting of the evolution of the COVID-19 pandemic: a non-linear regression model and an error correction model. Our strategy allows us to detect pandemic peaks and make short- and long-term forecasts of the number of infected, deaths and people requiring hospitalization and intensive care. The non-linear regression model is implemented in an expert system that automatically allows the user to fit and forecast through a graphical interface. This system is equipped with a control procedure to detect trend changes and define the end of one wave and the beginning of another. Moreover, it depends on only four parameters per series that are easy to interpret and monitor along time for each variable. This feature enables us to study the effect of interventions over time in order to advise how to proceed in future outbreaks. The error correction model developed works with cointegration between series and has a great forecast capacity. Our system is prepared to work in parallel in all the Autonomous Communities of Spain. Moreover, our models are compared with a SIR model extension (SCIR) and several models of artificial intelligence.

1. Introduction

The coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has spread throughout the world, leading to a terrible pandemic. After the outbreak started in China in December 2019, the next two countries affected were Italy and Spain. The infection's high transmissibility led some regions to suffer a particularly severe impact. Such is the case of the Autonomous Region of Madrid in Spain. On March 11, 2020, the World Health Organization (WHO) declared COVID-19 a pandemic. On the same date, Madrid was already in an extremely serious situation and all educational centers were closed. Three days later, on March 14, the state of alarm was declared throughout the country. On March 30, freedom of activity outside the home was reduced to essential services. These conditions were relaxed on April 13, when some non-essential activities were permitted. From April 26, children could go outside for 1 h every day, and one week later this measure was extended to the general population. Measures of varying magnitude have been taken during the three waves that have followed so far.
Mathematical models to track changes in the behavior and patterns of infection appear to be essential tools for making future decisions.
Some authors (see, e.g., [1,2]) were pioneers in collecting solutions based on artificial intelligence and expert knowledge, among the thousands of articles published on the subject.
Artificial intelligence is useful to monitor the evolution of the pandemic even in real time, either through expert systems or with a predictive approach based on machine learning. Researchers have been able to validate the effectiveness of these models with different illnesses. For example, through a dynamic neural network, it was possible to understand the evolution of Zika (see [3]). The same has happened with Ebola and the common flu. Currently, models are being retrained with new data related to COVID-19 (see [4]). Abhari et al. [5] used a previously developed agent-based artificial intelligence simulation platform (EnerPol) coupled with big data.
It is worth highlighting the need for artificial intelligence tools to be easy to use by those who want to operate with them. This is why, around the idea of monitoring and forecasting, projects have been generated to visualize the information collected. With this approach, in [6], we can find an ordered list of the most interesting sites with dashboards: UpCode, NextStrain, CSSE (Johns Hopkins), Thebaselab, the BBC, the New York Times, HealthMap and COVID-19 Tracker (Microsoft).
The SIR model ("Susceptible", "Infected", "Recovered") and its extensions are traditional epidemiological models. They are compartmental models of differential equations that relate the variations of different population groups (compartments) through the infection rate and the average infectious period. Most recent studies are based on modifications of the SIR model (see, e.g., [7,8,9]). The underlying idea is to model the waves of a pandemic as exponential increases and decreases to the left and right of a peak of maximum incidence. In Spain, it is worth highlighting the work carried out by the Interdisciplinary Group of Complex Systems at UC3M [10] and the work carried out by the MOMAT Group at UCM [11]. For comparison purposes, we implement the SIR model extension developed by Castro et al. [10], SCIR: a SIR model with "confinement".
In this paper, we approach the problem from another point of view: non-linear regression predictive models. Researchers from the Andalusian School of Public Health of Granada have developed a predictive model of the COVID-19 epidemic in Spain with an adjustment to a Gompertz curve [12]. To fit the Gompertz curve to the observed accumulated data of cases and deaths, they used the Nelder–Mead algorithm [13] as implemented by Nash [14]. The software used for the calculations was R, via the drc package. Our strategy extends this approach by allowing greater flexibility in fitting to Gompertz curves, especially in the distribution tails. Another Gompertz approximation was proposed by Catala et al. [15]. Our expert system automatically chooses the best fit from a variety of models, including the Gaussian, double exponential and double Pareto curves. In addition, all programming, the optimization algorithm and the heuristic are original.
Moreover, we develop an Error Correction Model (ECM). This approximation belongs to a category of multiple time series models for data where the underlying variables have a long-run common stochastic trend.
Our research group registered in the “Mathematical action against coronavirus”, a cooperative prediction initiative of the Spanish Mathematics Committee (CEMat). (A meta-predictor has been built to provide authorities with information on the short-term behavior of variables of great interest in the spread of the COVID-19 virus. The method uses the predictions from different models/algorithms, provided by the participating researchers, and constructs optimized combinations of them, disaggregated by Autonomous Communities.) Within this initiative, we have participated together with other research groups in the “Cooperative Prediction” action [16], providing daily predictions with our preliminary model since March 2020, during the entire first wave of the pandemic. All models that participate in the construction of the metapredictor developed by the cooperative prediction action promoted by the CEMat have been validated continuously since the beginning of the pandemic.
The results obtained in this paper are reproducible using the code from our public repository. The code for the developed graphical interface that allows the user to interact with our system is also included in: https://github.com/mikiNadal/covid19_article_reproducible (accessed on 22 June 2021).
Section 2 introduces the non-linear regression model. Section 3 introduces the error correction model. Section 4 introduces the SCIR model. Section 5 compares the three models with different metrics. Finally, the conclusions are presented in Section 6.

2. Non-Linear Regression Model

We aim to develop a theoretical framework that allows us to detect peaks and make short- and long-term monitoring and forecasting of the number of people infected, people requiring hospitalization and deaths during an infectious disease. With short-term prediction, we refer to the task that we performed for the CEMat during the first wave, consisting of giving predictions every day with a horizon of 8 days. From the second wave, we were asked for predictions every week with a horizon of 14 days. With long-term forecast, we refer to the prediction of the peak, the total number of infected at the end of a wave under study and giving commitment dates for which only a small percentage of the area under the model remains. These values are monitored day by day and are an indication, for example, of when a wave is exhausted.
This model is implemented with an expert system of artificial intelligence based on non-linear regression and is extremely useful to estimate the effectiveness of the interventions prompted by the governments and to advise on how to proceed in future outbreaks. Furthermore, the machine learning algorithm developed allows parallel running and introduction of new data in real time, and it is scalable.
Our model is based on directly estimating the distribution function of each of the series under study and on the duality between the distribution function and the density function. Since those two functions fully characterize the probability distribution of a continuous variable, our model is able to capture the main characteristics of epidemic outbreaks. To this, we can add its simplicity, since it is formulated only through three parameters. Hereinafter, we refer to our first proposed epidemiological model as the MATGEN model in honor of our group enrolled in the Mathematical action against coronavirus [16], an initiative of CEMat (Spanish Mathematics Committee).

2.1. The Model

The notation employed in this work is as follows:
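Let the well-known density function of a normal variable with mean $\mu$ and variance $\sigma^2$, $N(\mu, \sigma)$ (see Figure 1a), be given by
$$f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^{2}},$$
and both its distribution function and the right tail as follows:
$$F(t) = \int_{-\infty}^{t} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx,$$
$$1 - F(t) = \int_{t}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}\, dx.$$
Note that the density peak of $N(\mu, \sigma)$ is reached at $\mu$, which is a point of inflection of its distribution function. Furthermore, at $t = \mu$ it verifies that $F(t) = 1 - F(t) = 0.5$.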
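As a minimal sketch of these three functions, the Python standard library class `statistics.NormalDist` can be used. The values of `mu` and `sigma` below are illustrative only and are not fitted parameters from this work.

```python
from statistics import NormalDist

# Illustrative values only (assumption): peak at day 60, spread of 15 days.
mu, sigma = 60.0, 15.0
wave = NormalDist(mu=mu, sigma=sigma)

def density(t: float) -> float:
    """Theoretical density f(t) of N(mu, sigma)."""
    return wave.pdf(t)

def distribution(t: float) -> float:
    """Theoretical distribution function F(t)."""
    return wave.cdf(t)

def right_tail(t: float) -> float:
    """Right tail 1 - F(t)."""
    return 1.0 - wave.cdf(t)

# At the peak t = mu, both tails carry half of the probability mass.
half_left = distribution(mu)
half_right = right_tail(mu)
```

The equality of `half_left` and `half_right` at 0.5 is the property that the peak detection algorithm later uses as its stopping condition.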
For a review of the properties of the distribution function, the density function, the characteristic function of a random variable and the relationships between them, see the work of Quesada and Pardo [17].
Data series of COVID-19 in Spain include day by day the cumulative number of people infected, people requiring hospitalization and deaths. These data can be downloaded from ISCIII [18].
We denote both the relative and cumulative frequencies at time t as follows:
$$N_t \quad \text{(cumulative per day)},$$
$$n_t = N_t - N_{t-1} \quad \text{(new cases per day)},$$
$$f_t = \frac{n_t}{n},$$
$$F_t = \sum_{i=1}^{t} f_i,$$
where $n$ is the total number of cases at the end of the pandemic.
Furthermore, we introduce the average of the cumulative frequencies at time $t$ given by
$$Av_t = \frac{1}{t} \sum_{i=1}^{t} F_i.$$
In this context, we work with the following non-linear regression model:
$$F_i = F(i) + \varepsilon_i, \quad \varepsilon_i \sim N(0, \tau), \quad i = 1, \ldots, t,$$
where the parameters n, μ and σ are estimated by the least squares method.
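The quantities above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of the expert system: the function names are hypothetical, the count before the first day is assumed to be zero, days are indexed 1, 2, ..., and the toy counts are made up.

```python
from statistics import NormalDist

def empirical_frequencies(cumulative, n):
    """Return (f, F, Av) for cumulative counts N_1..N_t and an assumed total n.

    f  : relative frequencies f_i = n_i / n
    F  : cumulative frequencies F_i = f_1 + ... + f_i
    Av : average of F_1..F_t at the last day t
    Assumption: the count before the first day is 0.
    """
    new_cases = [cumulative[0]] + [
        cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))
    ]
    f = [n_i / n for n_i in new_cases]
    F, running = [], 0.0
    for f_i in f:
        running += f_i
        F.append(running)
    Av = sum(F) / len(F)
    return f, F, Av

def sum_of_squares(cumulative, n, mu, sigma):
    """Least-squares objective: sum over i of (F_i - F(i))^2."""
    _, F, _ = empirical_frequencies(cumulative, n)
    model = NormalDist(mu, sigma)
    return sum((F_i - model.cdf(i)) ** 2 for i, F_i in enumerate(F, start=1))

# Toy cumulative counts (made up for illustration).
toy_counts = [1, 3, 7, 13, 19, 23, 25]
f, F, Av = empirical_frequencies(toy_counts, n=25)
```

Minimizing `sum_of_squares` over `n`, `mu` and `sigma` is one way to read the least squares estimation described above; the optimizer itself is left out of the sketch.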
For an introduction to frequentist and bayesian regression, see the work of Gómez Villegas [19].

2.2. Other Wave Models

The next subsections detail a basic guide for the correct implementation of the least squares method and the algorithm designed for the detection of pandemic peaks.

2.3. The Algorithm: Peak Detection

Initialize $t$ at $t_0$, the current moment.
Compute the mean squared error as follows:
$$ECM(t, n, \mu, \sigma \mid n_1, \ldots, n_t) = \frac{1}{t} \sum_{i=1}^{t} \left(F(i) - F_i\right)^2,$$
the total variance
$$SCT(t, n, \mu, \sigma \mid n_1, \ldots, n_t) = \frac{1}{t} \sum_{i=1}^{t} \left(F_i - Av_t\right)^2$$
and the coefficient of determination
$$R^2(t, n, \mu, \sigma \mid n_1, \ldots, n_t) = 1 - \frac{ECM(t, n, \mu, \sigma \mid n_1, \ldots, n_t)}{SCT(t, n, \mu, \sigma \mid n_1, \ldots, n_t)}.$$
In statistics, the coefficient of determination is the proportion of the variance in the dependent variable $F_i$ that is predictable from the independent variable $F(i)$. It is a statistic used in the context of goodness of fit and provides a measure of how well observed outcomes are replicated by the model, based on the proportion of total variation of outcomes explained by the model. This coefficient takes values between 0 and 1, and, between two models, the one with the highest determination coefficient is preferred. Furthermore, with this criterion, the best model is the one that maximizes the coefficient of determination within a plausible family of models:
$$\max_{n, \mu, \sigma} R^2(t, n, \mu, \sigma \mid n_1, \ldots, n_t).$$
On the other hand, we control at the same time the adjustment of the observed frequencies by means of the theoretical density function. To this end, the criterion that we follow is to perform a linear regression
$$f_i - f(i) = a + b\,i + e_i, \quad e_i \sim N(0, \nu), \quad i = 1, \ldots, t,$$
and to introduce the constraint
$$p\text{-value}(F_{obs}) = P\left(F_{1,t-2} > F_{obs}\right) > 0.1,$$
where $F_{obs}$ is the observed value of the test statistic for testing $H_0: b = 0$ vs. $H_1: b \neq 0$ and $F_{1,t-2}$ is its theoretical distribution under $H_0$, that is, a Snedecor's F distribution with 1 and $t-2$ degrees of freedom.
We propose to solve the multicriteria optimization problem by obtaining the values of $n_t$, $\mu_t$ and $\sigma_t$ so that
$$\max_{n, \mu, \sigma} R^2(t, n, \mu, \sigma \mid n_1, \ldots, n_t),$$
under the constraint
$$p\text{-value}(F_{obs}) = P\left(F_{1,t-2} > F_{obs}\right) > 0.1.$$
Now, stop if $F_t = F(t) = 0.5$; otherwise, incorporate the data of day $t + 1$, set $t = t + 1$ and repeat.
In parallel, fit a model for each of the series, namely the number of new positive cases per day, the number of new deaths per day and the number of new ICU admissions per day, and choose the model that simultaneously maximizes the three values of $R^2$.
Stop when $F_t = F(t) = 0.5$ is accomplished in all three series.
It is important to note that the algorithm allows the introduction of new data in real time, and it is scalable.
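The two criteria used in the steps above can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the function names are hypothetical, days are indexed 1, 2, ..., and the noise regression is taken to act on the differences between observed relative frequencies and the fitted density. Only the observed F statistic is computed here; its p-value would come from the F distribution with 1 and t - 2 degrees of freedom, for example via `scipy.stats.f.sf(F_obs, 1, t - 2)` if SciPy is available.

```python
from statistics import NormalDist, fmean

def r_squared(F_emp, mu, sigma):
    """R^2 = 1 - ECM / SCT for the fit of the distribution function."""
    model = NormalDist(mu, sigma)
    ecm = fmean([(model.cdf(i) - F_i) ** 2 for i, F_i in enumerate(F_emp, 1)])
    av = fmean(F_emp)
    sct = fmean([(F_i - av) ** 2 for F_i in F_emp])
    return 1.0 - ecm / sct

def slope_f_statistic(f_emp, mu, sigma):
    """Observed F statistic for H0: b = 0 in resid_i = a + b*i + e_i.

    Assumption: resid_i = f_i - f(i), observed minus fitted density.
    """
    model = NormalDist(mu, sigma)
    days = list(range(1, len(f_emp) + 1))
    resid = [f_i - model.pdf(i) for i, f_i in zip(days, f_emp)]
    x_bar, r_bar = fmean(days), fmean(resid)
    sxx = sum((x - x_bar) ** 2 for x in days)
    sxy = sum((x - x_bar) * (r - r_bar) for x, r in zip(days, resid))
    b = sxy / sxx                      # least-squares slope
    a = r_bar - b * x_bar              # least-squares intercept
    sse = sum((r - (a + b * x)) ** 2 for x, r in zip(days, resid))
    ssr = b * b * sxx                  # regression sum of squares (1 d.f.)
    return ssr / (sse / (len(days) - 2))
```

A candidate triple (n, μ, σ) would be kept only when the p-value of this statistic exceeds 0.1, and among the kept candidates the one with the largest value of `r_squared` would be selected.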

2.4. The Algorithm: Commitment Dates

Let $t_p$ be the first day that $F_{t_p} = F(t_p) = 0.5$, and let $n_{t_p}$, $\mu_{t_p}$ and $\sigma_{t_p}$ be the optimal values of the parameters at that time point.
If $f_{t_p+1} \geq f_{t_p}$, set $\mu = t_p + 1$ and compute $n$ and $\sigma$ so that $F_{t_p+1} = F(t_p+1) = 0.5$.
Otherwise, determine the value of $t_{max}$ so that $f_{t_{max}} = \max_{t \leq t_p} f_t$, set $\mu = t_{max}$, compute $n$ and $\sigma_{left}$ so that $F_{t_{max}} = F(t_{max}) = 0.5$, and fit $\sigma_{right}$ to the series for $t \geq t_{max}$.
Finally, the percentiles $qnorm(0.99)$ and $qnorm(0.999)$ of the normal probability distribution $N(t_{max}, \sigma_{right})$ are the commitment dates to lift the restrictions from least to most conservative.
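The last step can be sketched with the inverse distribution function of the standard library. The function name and the `day_zero` argument (the calendar date taken as day 0 of the series) are hypothetical, and rounding to the nearest whole day is an assumption of this sketch.

```python
from datetime import date, timedelta
from statistics import NormalDist

def commitment_dates(t_max, sigma_right, day_zero):
    """Calendar dates of the 0.99 and 0.999 percentiles of N(t_max, sigma_right).

    day_zero is the (hypothetical) calendar date of day 0 of the series.
    Percentiles are rounded to the nearest whole day (assumption).
    """
    model = NormalDist(mu=t_max, sigma=sigma_right)
    percentiles = (model.inv_cdf(0.99), model.inv_cdf(0.999))
    return tuple(day_zero + timedelta(days=round(q)) for q in percentiles)

# Illustrative values only, not fitted parameters from this work.
least_conservative, most_conservative = commitment_dates(
    t_max=100, sigma_right=10, day_zero=date(2020, 1, 30)
)
```

A larger `sigma_right` pushes both dates further into the future, which is how a slowdown to the right of the peak translates into later commitment dates.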

2.5. The Heuristic

To perform an effective optimization, we opt for an ambitious heuristic that we detail below.
Let $\sigma = \sigma_0$, starting at $\sigma_0 = 15$.
Move $\sigma$ between $\sigma_0 - 14$ and $\sigma_0 + 14$.
At this point, it is important to note that the incubation period of the disease is between 2 and 14 days (see [20]). In addition, the delay between the time of infection and the report as a positive case is considered.
Let $\mu = \mu_0$, starting at $\mu_0 = t_0$, $t_0$ being the current moment.
Move $\mu$ between the first day of each of the series and $t_0 + 2\sigma_0$. For example, the first day of the series of the number of people infected in the Region of Madrid is Day 26, which corresponds to February 25.
Generate $k = 10{,}000$ values of a uniform random variable between 0 and 1.
Compute $n = N \cdot p$ for each value $p$ generated in the previous step, where $N$ is the population size (approximately 6,550,000 in the Region of Madrid).
Discard the values of $n < N_{t_0}$.
Find the feasible models with p-value > 0.1 for the noise and select the one with the largest $R^2$ of the fit in the cumulative frequencies.
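The search above can be illustrated with a simplified random search. This sketch makes several assumptions that are not in the text: σ and μ are drawn at random inside their ranges instead of being moved systematically, the count before the first day is taken as zero, the p-value feasibility filter is omitted, and all function and argument names are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def r2_of_fit(days, F_emp, mu, sigma):
    """Coefficient of determination of the fit of F(day) to F_emp."""
    model = NormalDist(mu, sigma)
    ecm = fmean([(model.cdf(d) - F) ** 2 for d, F in zip(days, F_emp)])
    av = fmean(F_emp)
    sct = fmean([(F - av) ** 2 for F in F_emp])
    return 1.0 - ecm / sct

def random_search(days, cumulative, population, sigma0=15.0, k=10_000, seed=0):
    """Return the candidate (n, mu, sigma) with the largest R^2, or None."""
    rng = random.Random(seed)
    t0, N_t0 = days[-1], cumulative[-1]
    best = None
    for _ in range(k):
        n = population * rng.random()            # n = N * p with p ~ U(0, 1)
        if n < N_t0:                             # discard totals below N_{t0}
            continue
        sigma = rng.uniform(sigma0 - 14, sigma0 + 14)
        mu = rng.uniform(days[0], t0 + 2 * sigma0)
        F_emp = [N_t / n for N_t in cumulative]  # F_t = N_t / n
        r2 = r2_of_fit(days, F_emp, mu, sigma)
        if best is None or r2 > best["r2"]:
            best = {"n": n, "mu": mu, "sigma": sigma, "r2": r2}
    return best
```

Fixing `seed` makes a run repeatable, which is convenient when the day-to-day history of the fitted parameters is compared across runs.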
In practice, running the heuristic generates a .csv file that contains several columns. These include the fitted parameters $\mu$, $\sigma$ and $n$, the coefficient of determination and the p-value. Moreover, two columns are added to register every day the moment of the real peak, which corresponds to the day with the highest frequency observed to date, and the day when the cutoff between the models fitted to both sides is observed, i.e., when the distribution becomes positively skewed. The algorithm tries to match the value of the real peak, the cutoff and the parameter $\mu$. It also allows fitting a different $\sigma$ to the left and right of the cutoff. The last two columns include the commitment dates corresponding to percentiles $qnorm(0.99)$ and $qnorm(0.999)$ of the fitted model to the right.
In the next subsections, we present the results that are obtained from the run of the previous algorithm programmed through our expert system. To do this, we consider the data series of COVID-19 in Spain, which are published in [18]. Specifically, we study the case of the Region of Madrid.

2.6. COVID-19 Data Sets

It is necessary to consider the time required to test the presence of the infection and obtain a report to test positive for the virus. This is especially relevant when there are problems with access to care and with bottlenecks in laboratory testing. At some moments, this led the health system in Madrid to test only people with severe symptoms. In addition, the delay of up to several weeks in the notification of positive cases by the laboratories led to changes in the data history depending on the day the data were downloaded (see Table 1 and Table 2).
Even when the data come from official sources, they may present inconsistencies that must be taken into account. The portal to access the European Union open data [21] publishes data on the evolution of COVID-19 by continent and broken down by country. It can be verified that only positive PCRs are counted in the series of cases on this portal. On the other hand, Spain (see [18]) and Italy (see [22,23,24]) offer more detailed information through their national institutional portals. For example, on April 17, Spain introduces two columns, PCR and TestAc, and TestAc is empty until April 18, when the government introduces this type of test into the count. On the recommendation of the Spanish Mathematics Committee, we chose to consider confirmed cases as PCR+TestAc. This situation has been remedied since the second wave of the pandemic.
Another controversial point is the discrepancy between the deaths notified as due to COVID-19 and the real deaths registered by the funeral services of the Autonomous Communities of Spain. This discrepancy suggests that only a percentage of the real deaths due to COVID-19 were recorded. As a matter of fact, many elderly people died in nursing homes before being tested for COVID-19, and in most cases they were not included in the number of deaths.
One more problem that we have had to face is related to the series of ICUs in the Region of Madrid, in which we found an anomaly. Since April 28, the Region of Madrid offers cumulative data on the number of people with coronavirus who have gone through the ICU. Before this date, the data were those of daily occupation. Furthermore, the number of ICU beds varied throughout the course of the epidemic. The maximum number was changing due to the increase in ICU beds in the large hospitals in Madrid and especially the provision of new hospital beds in the IFEMA hospital.

2.7. Expert System

Our expert system was designed to facilitate obtaining the results and it consists of several parts. First, it allows downloading and updating the data file in real time every day. Once the data file has been updated, the expert system allows us to run the algorithm for all the series in parallel or one in particular, as well as for all the provinces of Spain or a specific one. Once the algorithm has been run, our expert system returns three reports:
  • A .pdf file with three graphs for the current time: one of the fitted distribution function, one of the fitted density function and one of the adjustment to white noise. Figure 2a,b shows these reports for cases and deaths in the Region of Madrid.
  • A .csv file (see Tables 3 and 5) with the results of the entire day-to-day history of the process, from which the following can be extracted: (i) the optimal parameters; (ii) the coefficient of determination of the fit to the distribution function; and (iii) the p-value of the fit to white noise of the relative errors of the fit to the density function. In addition, this .csv file also contains the day-to-day commitment dates.
  • A .csv file (see Table 6) with an 8-day horizon of the forecast made with the fitted model and a graph with the future model.

2.8. New and Cumulative Confirmed Cases per Day Series

Figure 2 and Table 3, Table 4, Table 5 and Table 6 summarize the results for the new and cumulative confirmed cases per day in the Region of Madrid.
The real peak of the series is one of the most difficult values to predict. Some days after it has taken place, it is easy to know that the peak was already reached on March 26. Table 3 and Figure 2 indicate that μ matches the value of the real peak between April 17 and 21. However, as the curve of μ variation per day (see Figure 2) begins to flatten from March 20, the model alerts us in advance of the possibility that the real peak appears at any time after that date.
In a situation of free virus transmission, the model would fit a perfect Gaussian distribution and μ would be equal to the real peak. Therefore, small deviations from the model to the left of the real peak may indicate a change in the evolution of the virus. For example, between March 8 and 14, the curve of the relative frequencies (blue) is above the model fitted for the density function (orange) (see Figure 2a), which may indicate the dangerous situation present in Madrid before March 8. This dangerous situation could be a consequence of individual events increasing close contact between people. This was already evident in the fitted models until this date. The period prior to March 11 can be considered one of free disease transmission because interventions had not yet been applied. Nevertheless, the free-transmission model is still observed several days after that date. In fact, on March 17, there is a small peak of cases, which could be due to a high number of contagions during different massive events in Madrid on March 8. On March 22, the measures imposed by the government started to be noticed, since the observed data lie below the fitted model, and that situation continues until the global peak of cases is reached around March 26. On March 30, after the peak of cases, freedom of activity outside the home was reduced to essential services. The effect of this intervention is noted on April 9, when the real data lie below the fitted model, and that trend remains until April 12. It seems that interventions take around 11 days to be noticed.
It is important to note that the parameter σ of the model changes to the right of the real peak of cases. This indicates that the containment measures are in fact effective. This results in an increase in the variance of the model to the right of the turning point μ , which indicates a slowdown in infections (see Figure 2). On April 18, with the addition of the column TestAc to the dataset, an explosion in the graph of the density of cases is observed (see Figure 2a). Except for this incident, the model remains quite stable to the right of the peak of cases and commitment dates for the progressive lifting of mobility restrictions can be proposed, as Figure 2 and Table 4 show. For example, May 11 shows commitment dates between May 19 and June 5 on the basis of 0.01 and 0.001 for the right tail area of the model (with a forecast of 72,200 for the total of confirmed cases at the end of the pandemic).
This report concludes with a forecast for cumulative cases over the 8-day horizon. For example, Table 6 shows the forecast from May 11 to 18 generated on May 11.

2.9. New and Cumulative Deaths per Day Series

Figure 2b summarizes the results for the new and cumulative deaths per day series in the Region of Madrid.
A similar analysis to the one explained for cases can be done for deaths. Some days after the real peak of deaths has taken place, it is easy to know that it was already reached on March 28. The peak of the model, μ, was reached around April 1 (see Figure 2b).
The update on May 11 shows two early peaks on March 14 and 16. This situation was followed by an increase with respect to the fitted model between March 22 and 28 and then between April 2 and 9. Between these two periods of time, the situation of ICUs is dramatic, as explained in the following subsection.
It is important to highlight that the model changes to the right side due to the effectiveness of the containment measures. This translates into an increase in the variance of the model on the right side, which indicates a slowdown in deaths. On May 11, the fitted model forecasts a total of around 9000 deaths at the end of the pandemic.

2.10. New and Cumulative ICUs per Day Series

Figure 2c summarizes the results of new and cumulative ICUs in the Region of Madrid.
Although the official data are confusing, the fitted model for new and cumulative ICUs reveals that the median of the model occurred between March 24 and 25 (see Figure 2c). To fit a suitable model to the series of ICUs, it was necessary to modify the algorithm by considering two uniform models and one exponential model, which adequately describe the consecutive plateaus to the right of the normal model. It is important to note that the cumulative frequencies of new ICUs evolve very slowly, as the first graph in Figure 2c shows. It can be seen that 10% of the probability distribution of the model remains after May 11.
Taking into account that the duration of ICU stay depends on each patient and it usually ranges between 8 and 28 days, one can understand the saturation of the ICUs in the Region of Madrid.
The real peak of the series is one of the most difficult features to forecast. It may have occurred after the peak of deaths on March 28. The date when more new cases were incorporated into ICUs was March 20 with 205. This situation is detected with the value of the parameter μ of the normal model fitted to the left, which corresponds to March 21 (see the second graph of Figure 2c). However, that date does not correspond to the real peak because ICUs became saturated. On April 2, the reported number of ICUs reached the highest value: 1528. The two dates cited are around the worst moments in terms of numbers of deaths.
On May 11, the fitted model forecasts that a total of 4000 people will have gone through the ICU at the end of the pandemic.

2.11. Second Wave

After the first wave, the format in which the data were provided changed and their quality increased, although the series continues to change from one day to the next.
Figure 3 shows the fitted model for Madrid on December 12, for the second wave of all series: cases, deaths, hospitalizations and ICU admissions. Figure 4 shows the monitoring of the parameters $\mu$, $\sigma_{left}$, $\sigma_{right}$ and $n$. Note how the effect of the interventions is manifested in a preview of the peak, in the jump that $\sigma_{left}$ experiences with respect to $\sigma_{right}$ and in the stabilization of $n$ after reaching the peak. In addition, the peak is predicted in the future from the end of August.
Figure 5 and Figure 6 show the fitted models of the second wave for Asturias. These figures make evident the usefulness of testing different regression functions. Unlike in Madrid, the expert system selects as best fits those made with a double exponential or double Pareto model.
During the third and fourth waves, we incorporated the error correction model into our expert system, which increased our predictive capacity. This is the model with which we currently give our 14-day predictions to the Spanish Mathematics Committee. We found in the comparison tool implemented on the website of this initiative that this new model remains among the top three over time with respect to all error metrics. Another of its advantages is that it adjusts the four data series at the same time: cases, deaths, hospitalized and ICUs.

3. Error Correction Model

3.1. Model Definition

It is intuitive to think that a peak in the confirmed series will increase the other three data series with some delay. This type of relationship is usually modeled by shifting the confirmed series p times into the future and using this feature to predict the present of the other series. For a single series, this approach is called an auto-regressive model, and it is generalized to several series in the vector auto-regressive (VAR) model. This type of model requires the time series to be stationary, which means that the first two moments of the process are constant over time. This restriction is a problem in this case, where we want to predict future changes in trends.
When one has a set of stationary time series, $y_t = (y_{1t}, \ldots, y_{Kt})'$, one can define the stable vector auto-regressive model of order $p$ as:
$$y_t = \nu + A_1 y_{t-1} + \cdots + A_p y_{t-p} + u_t$$
where $A_i$ are matrices of parameters, $\nu = (\nu_1, \ldots, \nu_K)'$ represents the mean of each time series and $u_t$ represents the random error term of the model, with $u_t \sim N(0, \Sigma_u)$.
The usual procedure when working with this model with non-stationary series is to difference them until they become stationary, but this procedure clouds the inference about the model.
In our case, a long-term relationship between the series is expected: the stochastic trend is shared between the series, which go down and up in the same way, keeping a certain distance between them.
We say that the set of time series $y_t$ has a long-run equilibrium if there exists a vector $\beta$ such that $\beta' y_t = \beta_1 y_{1t} + \cdots + \beta_K y_{Kt} = 0$, and we define the process $z_t = \beta' y_t$ as the deviations from this relation.
Thus, the series in the set $y_t$ are said to be cointegrated if $y_{it}$ is integrated of order 1 for all $i = 1, \ldots, K$ and there exists a vector $\beta$ such that $z_t = \beta' y_t$ is a stationary process.
With this relationship in mind, it is possible to define a model based on the vector auto-regressive model for this type of data, which is the error correction model and is given by
$$\begin{aligned}
\Delta y_t &= \Pi y_{t-1} + \Gamma_1 \Delta y_{t-1} + \cdots + \Gamma_{p-1} \Delta y_{t-p+1} + \phi d_t + u_t \\
&= \alpha \beta' y_{t-1} + \Gamma_1 \Delta y_{t-1} + \cdots + \Gamma_{p-1} \Delta y_{t-p+1} + \phi d_t + u_t
\end{aligned}$$
where $\Gamma_i$ are matrices of parameters, $d_t$ is the deterministic part of the model, $u_t \sim N(0, \Sigma_u)$ and $\Pi = \alpha \beta'$, with $\alpha$ the loading matrix and $\beta$ the cointegration matrix.
Integrated and cointegrated systems must be interpreted cautiously. A tool to make inferences about the model is the impulse response function. This tool describes the evolution of the model variables in reaction to a shock in one or more of them. The impulse response function is developed for VAR models, but we can express the error correction model as a VAR model, as mentioned in [25].
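The correspondence between the two representations follows from standard algebra. As a sketch (the symbol $I_K$ for the $K \times K$ identity matrix is introduced here, and the deterministic part is reduced to a constant $\nu$ for simplicity), subtracting $y_{t-1}$ from both sides of a VAR of order $p$ and regrouping the lags gives

$$\Delta y_t = \nu + \Pi y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta y_{t-i} + u_t, \qquad \Pi = -\left(I_K - A_1 - \cdots - A_p\right), \qquad \Gamma_i = -\left(A_{i+1} + \cdots + A_p\right).$$

For $p = 2$, for example, $y_t = \nu + A_1 y_{t-1} + A_2 y_{t-2} + u_t$ becomes

$$\Delta y_t = \nu + \left(A_1 + A_2 - I_K\right) y_{t-1} - A_2 \Delta y_{t-1} + u_t,$$

so that $\Pi = A_1 + A_2 - I_K$ and $\Gamma_1 = -A_2$. Reading these identities in the other direction recovers the VAR coefficients from an estimated error correction model, which is what allows impulse responses to be computed.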
For the application of the error correction model to the COVID-19 data, the following modifications are made:
  • The logarithm is first applied to each series. The seasonality is multiplicative while the model is additive, so the transformation is needed for the proposed model to capture it.
  • The data series show a strong seasonality due to the data collection and publication policy of each region. To capture this seasonality, the deterministic component $d_t$ includes a dummy variable for each day of the week except one (to avoid collinearity).
  • A last change in the model is a dummy variable for a change of scenario: the testing policy, since tests were not widely available at the beginning of the pandemic. At the beginning of the first wave, only some suspected cases could be tested for SARS-CoV-2; later, diagnostic testing was extended to all suspected cases and close contacts, and several mass screenings were even performed.
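The three modifications above can be sketched as a small preprocessing step. This is only an illustration under stated assumptions: the function names, the choice of Monday as the omitted weekday and the change date are hypothetical, and the counts are assumed to be strictly positive so that the logarithm is defined.

```python
import math
from datetime import date, timedelta

def log_transform(counts):
    """Natural logarithm of each count; assumes strictly positive values."""
    return [math.log(c) for c in counts]

def weekday_dummies(day, reference_weekday=0):
    """Six 0/1 indicators, one per weekday except the reference one.

    Python's date.weekday() returns 0 for Monday, so the default omits Monday.
    """
    others = [w for w in range(7) if w != reference_weekday]
    return [1 if day.weekday() == w else 0 for w in others]

def regime(day, change_date):
    """1 up to and including the change date (t <= T_1), 2 afterwards."""
    return 1 if day <= change_date else 2

start = date(2020, 2, 25)                    # a Tuesday
days = [start + timedelta(days=i) for i in range(7)]
d_t = [weekday_dummies(d) for d in days]     # deterministic component per day
change_date = date(2020, 2, 28)              # hypothetical T_1, for illustration only
regimes = [regime(d, change_date) for d in days]
```

Dropping one weekday keeps the dummy columns from summing to a constant, which is the collinearity the text refers to.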
With these changes, the final model formula is given by Equations (3) and (4).
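$$\begin{aligned} \Delta y_t &= \Pi_1 y_{t-1} + \Gamma_{1,1} \Delta y_{t-1} + \cdots + \Gamma_{1,p-1} \Delta y_{t-p+1} + \phi_1 d_t + u_t \\ &= \alpha_1 \beta_1' y_{t-1} + \Gamma_{1,1} \Delta y_{t-1} + \cdots + \Gamma_{1,p-1} \Delta y_{t-p+1} + \phi_1 d_t + u_t \quad \text{for } t \le T_1 \end{aligned} \tag{3}$$
and
$$\begin{aligned} \Delta y_t &= \Pi_2 y_{t-1} + \Gamma_{2,1} \Delta y_{t-1} + \cdots + \Gamma_{2,p-1} \Delta y_{t-p+1} + \phi_2 d_t + u_t \\ &= \alpha_2 \beta_2' y_{t-1} + \Gamma_{2,1} \Delta y_{t-1} + \cdots + \Gamma_{2,p-1} \Delta y_{t-p+1} + \phi_2 d_t + u_t \quad \text{for } t > T_1 \end{aligned} \tag{4}$$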
Once the model is defined, only one parameter remains to be chosen: the auto-regressive order p, that is, the number of lags of each of the series in the equation. Since the model is cheap to compute, p is chosen by a cross-validation procedure over different candidate values, taking the order with the smallest mean absolute percentage error.
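A minimal sketch of this selection step follows. It assumes a rolling-origin form of cross-validation, which the text does not specify, and it uses a placeholder forecaster (the mean of the last p observations) purely so that the example runs; in the setting described here the forecaster would be the fitted error correction model. All function names are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, predicted))
    return 100.0 * total / len(actual)

def rolling_origin_mape(series, p, forecaster, horizon=1, min_train=5):
    """Refit on growing histories and score the forecast `horizon` steps ahead."""
    actual, predicted = [], []
    for end in range(max(min_train, p), len(series) - horizon + 1):
        forecast = forecaster(series[:end], p, horizon)
        actual.append(series[end + horizon - 1])
        predicted.append(forecast[-1])
    return mape(actual, predicted)

def select_order(series, candidates, forecaster, horizon=1):
    """Return the candidate order with the smallest MAPE, plus all scores."""
    scores = {p: rolling_origin_mape(series, p, forecaster, horizon)
              for p in candidates}
    return min(scores, key=scores.get), scores

def mean_of_last_p(train, p, horizon):
    """Placeholder forecaster (NOT the model of the paper)."""
    level = sum(train[-p:]) / p
    return [level] * horizon

series = [10.0 * i for i in range(1, 13)]   # synthetic, steadily increasing data
best_p, scores = select_order(series, [1, 2, 3], mean_of_last_p)
```

On this synthetic increasing series the placeholder favors the shortest window, so `best_p` is 1; with real data and the actual model the ranking would of course differ.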

3.2. Application to the Case Study

To assemble the model, we proceed as described in the previous section.
A logarithmic transformation is applied to the data and the pairwise cointegration of the series under study is checked with the Engle and Granger procedure [26] and the Phillips–Ouliaris test [26]. Both tests work with the null hypothesis of no cointegration, and the p-values for each pair of series are shown in Table 7.
The cointegration relationship shows that there is a long-term equilibrium between the series. Another relationship present between them is the regressive part. To confirm it, a cross-correlation study was carried out between each of the ICU, hospitalized and deaths series and the series of confirmed patients. To avoid spurious correlations, each series is first made stationary individually by applying the logarithm and differencing once.
Figure 7a shows that there are significant lags of the series with respect to the confirmed series (as well as among all of them, which is outside the scope of this article), indicated by the cross-correlations entering the rejection region of the test for zero correlation.
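The cross-correlation check can be sketched as follows. This is an illustration only: the two count series are invented, the function names are hypothetical, and the rejection region is approximated by the usual large-sample bound $\pm 1.96 / \sqrt{n}$ for a test of zero correlation at the 5% level, which may differ from the exact rule used for Figure 7a.

```python
import math

def diff_log(counts):
    """First difference of the log series; assumes strictly positive counts."""
    logs = [math.log(c) for c in counts]
    return [b - a for a, b in zip(logs, logs[1:])]

def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag], for lag >= 0.

    x and y must have the same length.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    cov = sum((x[t] - mean_x) * (y[t + lag] - mean_y)
              for t in range(n - lag)) / n
    return cov / (sd_x * sd_y)

def significance_bound(n, z=1.96):
    """Approximate bound outside which zero correlation is rejected."""
    return z / math.sqrt(n)

# Invented daily counts, for illustration only.
confirmed = [120, 150, 210, 260, 380, 410, 600, 640, 900, 1000, 1500, 1400]
hospitalized = [30, 33, 40, 52, 60, 85, 95, 140, 150, 210, 240, 350]

dx, dy = diff_log(confirmed), diff_log(hospitalized)
bound = significance_bound(len(dx))
ccf = {k: cross_correlation(dx, dy, k) for k in range(5)}
significant_lags = [k for k, value in ccf.items() if abs(value) > bound]
```

A lag `k` in `significant_lags` suggests that changes in confirmed cases are followed, `k` days later, by changes in the second series.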
To obtain the optimal order for all of them simultaneously, the auto-regressive order p is optimized, obtaining in this case that the optimal p is 9.
To ensure the adequacy of the model, the independence of the errors is checked. The p-values for the augmented Dickey–Fuller test [26] and the Phillips–Perron test [26] are included in the auto-correlation graph of the residuals in Figure 7b.
No pattern is observed in the auto-correlation graph of the residuals in Figure 7b, and the p-values of the tests allow us to reject the hypothesis of the presence of a unit root, which is why it is concluded that the errors are stationary.
Another good check in the case study is to see how the peak of the wave is predicted. To this end, Figure 7c presents the current date on the x-axis and the peak date on the y-axis. The black line represents the maximum value of the series up to the real date, while the blue line represents the date of the peak predicted by the model.
It can be observed in the graph that, while in the first half of July the peak was predicted for mid-August, in the second half of July the predicted peak date shifts sharply to mid-September and remains stable there until the actual date of the peak (18 September 2020).
In addition, the model is capable of detecting when the peak has passed. This is observed in Figure 7c: from the real date of the peak (18 September 2020) onwards, the model places the peak in the past.
To check the adequacy of the model for prediction, error metrics are calculated [27] using cross-validation techniques [28] for fitted and forecast values and presented for frequency (Table 8) and cumulative data (Table 9).
Finally, the results of applying the impulse response function with two different histories are presented. The first represents the data from the beginning of the pandemic to before the second wave. The second represents the data from the beginning of the pandemic to before the third wave. This graph can be observed in Figure 8a.
In this figure, the maximum influence and the moment at which its effects decrease can be observed, the curves being less pronounced.
It should be noted that, initially, this model was not developed for the prediction of confirmed cases, but rather for the prediction of the other three series that depend closely on it. Despite this, the model works really well with this series, but it works better in the inpatient series.
The implementation of the model allows quick execution of the fit and forecast for all the Autonomous Communities (Figure 8b).
For the comparisons between models, we considered an extension of the SIR model proposed by Castro et al. [10] (the SCIR model) and an automatic implementation of the ARIMA forecast model by Hyndman and Khandakar [29]. We chose the SCIR model because it incorporates the compartment of the deaths into its definition and is formulated through only five parameters. Different extensions of SIR models can be found in [30,31]. The automatic ARIMA forecast model was chosen due to its widespread use in time series forecasting. Some examples can be found in [32,33].

4. SCIR Model: A SIR Model with Confinement

The SCIR model includes the usual states of an SIR model plus a class C for individuals sent to confinement who are susceptible, but not infected. Susceptible individuals (S) can enter and exit confinement (C) or become infected (I). Infected individuals can recover (R) or die (D). Figure 9b shows the diagram of the equations of the SCIR model.
For the optimization of the parameters, we included in the objective function an accuracy measure that combines the determination coefficients of the fits of both the cases and the deaths.
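A rough sketch of the SCIR dynamics and of such a combined objective is given below. Several points are assumptions rather than statements from this article: the differential equations follow the usual statement of the SCIR model in [10] with five parameters (infection rate `beta`, confinement entry and exit rates `q` and `p`, recovery rate `r`, death rate `mu`); the "cases" series is identified with the infected compartment; the two determination coefficients are combined by a plain average; and forward Euler integration is used only for brevity. Parameter and initial values are invented.

```python
def scir_derivatives(state, beta, q, p, r, mu):
    """Right-hand side of the assumed SCIR equations for (S, C, I, R, D)."""
    s, c, i, rec, d = state
    n = s + c + i + rec + d              # total population, conserved by the flows
    new_infections = beta * s * i / n
    return (
        -new_infections - q * s + p * c,   # susceptible
        q * s - p * c,                     # confined
        new_infections - (r + mu) * i,     # infected
        r * i,                             # recovered
        mu * i,                            # dead
    )

def simulate_scir(initial, params, days, dt=0.1):
    """Forward-Euler integration; returns one state per day, including day 0."""
    state = tuple(float(v) for v in initial)
    daily = [state]
    for _ in range(days):
        for _ in range(round(1 / dt)):
            deriv = scir_derivatives(state, **params)
            state = tuple(v + dt * dv for v, dv in zip(state, deriv))
        daily.append(state)
    return daily

def r_squared(observed, fitted):
    mean = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def objective(params, initial, obs_infected, obs_deaths):
    """1 minus the average R^2 of the infected and death fits (to be minimized)."""
    path = simulate_scir(initial, params, days=len(obs_infected) - 1)
    fit_i = [s[2] for s in path]
    fit_d = [s[4] for s in path]
    return 1.0 - 0.5 * (r_squared(obs_infected, fit_i) + r_squared(obs_deaths, fit_d))

# Invented values, for illustration only.
true_params = {"beta": 0.4, "q": 0.05, "p": 0.01, "r": 0.05, "mu": 0.005}
initial = (999990.0, 0.0, 10.0, 0.0, 0.0)
path = simulate_scir(initial, true_params, days=30)
obs_infected = [s[2] for s in path]
obs_deaths = [s[4] for s in path]
```

Any general-purpose optimizer could then minimize `objective` over the five parameters; which optimizer and which exact weighting were used for the comparisons is not stated here.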
In the next section, the comparison between the three models is illustrated with the second wave for the Region of Madrid.

5. Automatic SARIMA Model

The procedure followed to apply the ARIMA automatic adjustment method given in [29] is:
  • Automatic model adjustment, with the selection of the model’s hyper-parameters $(p, d, q)(P, D, Q)$ and a Box–Cox transformation [34] with parameter $\lambda$.
    The result of this step in the data is the model with $p = 1$, $d = 1$, $q = 2$, $P = 0$, $D = 1$, $Q = 1$ and $\lambda = 0.2086024$.
    The following model equation was chosen:
    $$(1 - \phi_1 B)(1 - B)(1 - B^7) y_t = (1 + \theta_1 B + \theta_2 B^2)(1 + \Theta_1 B^7) \varepsilon_t$$
    where $B$ is the backshift operator.
  • In the validation phase, the regression parameters are recalculated with these same hyper-parameters.
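The two deterministic ingredients of this specification, the Box–Cox transformation and the differencing operators $(1 - B)(1 - B^7)$, can be illustrated with a short sketch. It does not reproduce the automatic procedure of [29]; the function names are hypothetical and the test series is synthetic. Only the value of $\lambda$ is taken from the text.

```python
import math

LAMBDA = 0.2086024  # Box-Cox parameter reported in the text

def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value."""
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def inv_box_cox(z, lam):
    """Inverse transform, used to bring forecasts back to the original scale."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

def difference(series, lag=1):
    """Apply (1 - B^lag): each value minus the value `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def sarima_differencing(series, season=7):
    """Apply (1 - B)(1 - B^season), i.e. d = 1 and D = 1 with weekly period."""
    return difference(difference(series, 1), season)

# Synthetic series: linear trend plus a fixed weekly pattern.
weekly_pattern = [5, 1, 4, 2, 8, 6, 7]
series = [3 * t + weekly_pattern[t % 7] for t in range(21)]
stationary_part = sarima_differencing(series)   # all zeros for this series
```

A linear trend with a repeating weekly pattern is removed entirely by the two differences, which is why these operators suit daily series with reporting-day effects.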

6. Other Models

To verify the contribution of the two models developed, other algorithms such as neural networks [35] and machine learning algorithms were tested. Some of the resulting predictions are shown in Figure 10, which presents, for the same dates, the results of random forest and xgboost implemented with a direct forecasting strategy [36].
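In a direct forecasting strategy, a separate model is trained for each horizon h, mapping a window of past values to the value h steps ahead. The sketch below shows the idea under stated assumptions: a one-variable linear regression on the last value of the window stands in for random forest or xgboost, purely to keep the example self-contained, and all function names are hypothetical.

```python
def direct_training_pairs(series, n_lags, h):
    """Windows of n_lags past values paired with the value h steps ahead."""
    features, targets = [], []
    for t in range(n_lags - 1, len(series) - h):
        features.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + h])
    return features, targets

def fit_last_value_regression(features, targets):
    """Placeholder learner: least-squares line on the last value of each window."""
    x = [row[-1] for row in features]
    mean_x, mean_y = sum(x) / len(x), sum(targets) / len(targets)
    var_x = sum((v - mean_x) ** 2 for v in x)
    cov_xy = sum((v - mean_x) * (w - mean_y) for v, w in zip(x, targets))
    slope = cov_xy / var_x
    return mean_y - slope * mean_x, slope

def direct_forecast(series, n_lags, horizon):
    """Train one model per horizon and predict each step from the last window."""
    forecasts = []
    for h in range(1, horizon + 1):
        features, targets = direct_training_pairs(series, n_lags, h)
        intercept, slope = fit_last_value_regression(features, targets)
        forecasts.append(intercept + slope * series[-1])
    return forecasts

series = [2.0 * t for t in range(20)]          # synthetic linear series
forecasts = direct_forecast(series, n_lags=3, horizon=3)
```

Unlike a recursive strategy, no forecast is fed back as an input, so errors do not accumulate across horizons, at the cost of fitting one model per horizon.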
In [37], a deep long short-term memory network is used very successfully for financial time series forecasting. We tried to apply this neural network to our problem, but the result was not satisfactory. This is most likely because the number of samples in each of the waves is insufficient to train a deep learning network.

7. Comparisons

Figure 11 and Figure 12 show the fit and 14-day forecasts with the three models for the time interval corresponding to the second wave, for eight different endings of the historical time series (all of them starting on June 24), for the cumulative and non-cumulative absolute frequencies of the cases, respectively.
Figure 13 shows the corresponding box-plots for the metrics values obtained with the three models. The means of the metrics are shown in Table 10 and Table 11. The individual values for the eight different data histories are shown in Table 12, Table 13, Table 14 and Table 15. In general, it is noted that our two proposed MATGEN models improve the metric values obtained with the SCIR model in both fitting and forecast.
Figure 14 shows the monitoring of the peak detection achieved with the three models. The model that best detects the peak is the non-linear regression model.

8. Conclusions

A non-linear regression model for count data in time series is developed and implemented by means of an expert system of artificial intelligence. It is based on directly estimating the distribution function of each of the series under study and on the duality between the distribution function and the density function. Since those two functions fully characterize the probability distribution of a variable, our model is able to capture the main characteristics of epidemic outbreaks. The simplicity of MATGEN must also be noted, since it is formulated only through four parameters: $\mu$, $\sigma_{left}$, $\sigma_{right}$ and $n$. The monitoring of all model parameters makes it possible to easily quantify and detect the effect of interventions over time. Furthermore, the machine learning algorithm developed is scalable, allows parallel running of different data series and is capable of introducing new data in real time.
We apply this model to the COVID-19 series of the Region of Madrid (Spain) during the first and second waves to give an eight-day forecast for the Spanish Mathematical Initiative [16]. This theoretical framework allows us to detect pandemic peaks and make short- and long-term monitoring and forecasting of the number of people infected, people requiring hospitalization and deaths. This expert system proves very useful to estimate the effectiveness of the interventions prompted by the government, which seems to have an impact after 11 days of its implementation during the first wave. Moreover, it is useful to propose commitment dates to lift the mobility restrictions and to advise on how to proceed in future outbreaks. On May 25, the Region of Madrid entered Phase 1 of the de-escalation. The MATGEN update on May 11 showed commitment dates between May 19 and June 5 (with a forecast of 72,200, 9000 and 4000 for the total numbers of confirmed cases, deaths and ICUs at the end of the pandemic, respectively).
The flexibility of our theoretical framework allows us to fit different regression functions (Gaussian, Gompertz, double Pareto, double exponential and uniform) between different dates. During the first wave, the series of new cases per day and deaths in the Region of Madrid only needed one cutoff and normal models in each part. The ICU series adjustment was more difficult: three cutoffs and different models in a wide family of probability distributions were needed. The computational cost of the last situation is higher than that of the first. In this case, the algorithm needed to detect the appropriate number of cutoffs and tried all the possible combinations of the models belonging to the family considered.
At present, in order to provide 14-day forecasts for the Spanish Mathematical Initiative [16], we implement another MATGEN model based on error correction models, useful for estimating both short- and long-term effects of one time series on another. This model is based on the cointegration of the four series in the study, namely cases, deaths, hospitalizations and ICUs, and it was incorporated into the expert system too.
Finally, the comparison from different points of view of our two models with the SCIR model [10] yields the following conclusions:
  • Among the advantages of using the SCIR model, we can highlight its simplicity, since it is formulated using five parameters, and the interpretability of these parameters from an epidemiological point of view.
  • The MATGEN non-linear regression model (formulated with four parameters per series that are easy to monitor) is the most explanatory for studying the peak detection and the effect of interventions. In addition, it is equipped with a control procedure that allows detecting trend changes in the tails that indicate the start of a new wave. Unlike the two other models, the fit of the four series is in parallel.
  • The MATGEN error correction model depends on a greater number of parameters but allows us to approximate the four series simultaneously, and it is the best in all the metrics, both in fit and in forecast. In addition, this model incorporates the impulse response function, a method that allows making inference about the impact of one series on the remaining ones.
Therefore, MATGEN combines our two proposed models in one expert system as a new epidemiological tool that can prove extremely useful in new COVID-19 outbreaks and future epidemics of infectious diseases.

Author Contributions

Conceptualization, B.G.-P., J.L.S. and C.N.; methodology, B.G.-P., G.V. and J.L.S.; software, B.G.-P., J.M.V., G.V. and J.L.S.; validation, B.G.-P., G.V. and J.L.S.; formal analysis, all authors; investigation, all authors; writing—original draft preparation, B.G.-P., J.M.V. and J.L.S.; writing—review and editing, B.G.-P., J.M.V. and C.N.; visualization, B.G.-P. and J.L.S.; supervision, B.G.-P.; and project administration, B.G.-P. and J.M.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by Complutense University of Madrid, Spain, research group 910395 MÉTODOS BAYESIANOS.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Acknowledgments

We thank the reviewers for their suggestions that have helped to improve the initial draft of this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Naudé, W. Artificial Intelligence Against Covid-19: An Early Review. IZA Discuss. Pap. 2020, 13110. Available online: https://ssrn.com/abstract=3568314 (accessed on 22 June 2021).
  2. Verelst, F.; Willem, L.; Beutels, F. Behavioural change models for infectious disease transmission: A systematic review (2010–2015). J. R. Soc. Interface 2016, 13, 20160820.
  3. Akhtar, M.; Kraemer, M.U.G.; Gardner, L.M. A dynamic neural network model for predicting risk of Zika in real time. BMC Med. 2019.
  4. Hao, K. This is How the CDC Is Trying to Forecast Coronavirus’s Spread. 2020. Available online: https://www.technologyreview.com/2020/03/13/905313/cdc-cmu-forecasts-coronavirus-spread/ (accessed on 22 June 2021).
  5. Abhari, R.S.; Marini, M.; Chokani, N. COVID-19 Epidemic in Switzerland: Growth Prediction and Containment Strategy Using Artificial Intelligence and Big Data. medRxiv 2020.
  6. MITTechnologyReview. The Best, and the Worst, of the Coronavirus Dashboards. 2020. Available online: https://www.technologyreview.com/2020/03/06/905436/best-worst-coronavirus-dashboards/ (accessed on 22 June 2021).
  7. Ivorra, B.; Ferrández, M.R.; Vela-Pérez, M.; Ramos, A.M. Mathematical modeling of the spread of the coronavirus disease 2019 (COVID-19) taking into account the undetected infections. The case of China. Commun. Nonlinear Sci. Numer. Simul. 2020, 88, 105303.
  8. Wang, L.; Zhou, Y.; He, J.; Zhu, B.; Wang, F.; Tang, L.; Eisenberg, M.; Song, P.X.K. An epidemiological forecast model and software assessing interventions on COVID-19 epidemic in China. medRxiv 2020.
  9. Maier, B.F.; Brockmann, D. Effective containment explains sub-exponential growth in confirmed cases of recent COVID-19 outbreak in Mainland China. medRxiv 2020.
  10. Castro, M.; Ares, S.; Cuesta, J.A.; Manrubia, S. The turning point and end of an expanding epidemic cannot be precisely forecast. Proc. Natl. Acad. Sci. USA 2020, 117, 26190–26196.
  11. Ramos, A.; Ferrández, M.; Vela-Pérez, M.; Kubik, A.; Ivorra, B. A simple but complex enough θ-SIR type model to be used with COVID-19 real data. Application to the case of Italy. Phys. D Nonlinear Phenom. 2021, 421, 132839.
  12. Sánchez-Villegas, P.; Daponte Codina, A. Predictive models of the COVID-19 epidemic in Spain with Gompertz curves. Gac. Sanit. 2020.
  13. Nelder, J.A.; Mead, R. A Simplex Method for Function Minimization. Comput. J. 1965, 7, 308–313.
  14. Nash, J.C. Compact Numerical Methods for Computers: Linear Algebra and Function Minimisation; Hilger: Bristol, UK; New York, NY, USA, 1990.
  15. Català, M.; Alonso, S.; Alvarez-Lacalle, E.; López, D.; Cardona, P.J.; Prats, C. Empiric model for short-time prediction of COVID-19 spreading. medRxiv 2020.
  16. CEMAT. Cooperative Prediction. 2021. Available online: https://covid19.citic.udc.es/ (accessed on 22 June 2021).
  17. Quesada, V.; Pardo, L. Curso Superior de Probabilidades; Promociones y Publicaciones Universitarias (PPU): Madrid, Spain, 1988.
  18. ISCIII. Instituto de Salud Carlos III. 2020. Available online: https://cnecovid.isciii.es/covid19 (accessed on 22 June 2021).
  19. Gómez Villegas, M.A. Inferencia Estadística; Díaz de Santos: Madrid, Spain, 2011.
  20. Lauer, S.A.; Grantz, K.H.; Bi, Q.; Jones, F.K.; Zheng, Q.; Meredith, H.R.; Azman, A.S.; Reich, N.G.; Lessler, J. The Incubation Period of Coronavirus Disease 2019 (COVID-19) from Publicly Reported Confirmed Cases: Estimation and Application. Ann. Intern. Med. 2020, 172, 577–582.
  21. OpenDataUE. Portal for access to COVID-19 Open Data of the European Union. 2020. Available online: https://data.europa.eu/euodp/es/data/dataset/covid-19-coronavirus-data (accessed on 22 June 2021).
  22. githubItaly. Dati Andamento Nazionale. 2021. Available online: https://github.com/pcm-dpc/COVID-19/tree/master/dati-andamento-nazionale (accessed on 22 June 2021).
  23. githubItalyProvince. Dati Province. 2021. Available online: https://github.com/pcm-dpc/COVID-19/tree/master/dati-province (accessed on 22 June 2021).
  24. githubItalyRegioni. Dati Regioni. 2021. Available online: https://github.com/pcm-dpc/COVID-19/tree/master/dati-regioni (accessed on 22 June 2021).
  25. Lütkepohl, H. New Introduction to Multiple Time Series Analysis; Springer: Berlin, Germany, 2005.
  26. Trapletti, A.; Hornik, K. Tseries: Time Series Analysis and Computational Finance, R package version 0.10-47; 2019.
  27. Hyndman, R.; Athanasopoulos, G.; Bergmeir, C.; Caceres, G.; Chhay, L.; O’Hara-Wild, M.; Petropoulos, F.; Razbash, S.; Wang, E.; Yasmeen, F. Forecast: Forecasting Functions for Time Series and Linear Models, R package version 8.13; University of Bath: Bath, UK, 2020.
  28. Bergmeir, C.; Hyndman, R.J.; Koo, B. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Comput. Stat. Data Anal. 2018, 120, 70–83.
  29. Hyndman, R.J.; Khandakar, Y. Automatic time series forecasting: The forecast package for R. J. Stat. Softw. 2008, 26, 1–22.
  30. Uddin, M.S.; Nasseef, M.T.; Mahmud, M.; AlArjani, A. Mathematical Modelling in Prediction of Novel CoronaVirus (COVID-19) Transmission Dynamics. Preprints 2020.
  31. Pazos, F.; Felicioni, F.E. A control approach to the Covid-19 disease using a SEIHRD dynamical model. medRxiv 2020.
  32. Alghamdi, T.; Elgazzar, K.; Bayoumi, M.; Sharaf, T.; Shah, S. Forecasting Traffic Congestion Using ARIMA Modeling. In Proceedings of the 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), Tangier, Morocco, 24–28 June 2019; pp. 1227–1232.
  33. Yermal, L.; Balasubramanian, B. Application of Auto ARIMA Model for Forecasting Returns on Minute Wise Amalgamated Data in NSE. In Proceedings of the 2017 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Coimbatore, India, 14–16 December 2017; pp. 1–5.
  34. Box, G.E.P.; Cox, D.R. An Analysis of Transformations. J. R. Stat. Soc. Ser. B Methodol. 1964, 26, 211–243.
  35. Hyndman, R.; Athanasopoulos, G. Forecasting: Principles and Practice, 2nd ed.; OTexts: Melbourne, Australia, 2018.
  36. Taieb, S.B.; Bontempi, G.; Atiya, A.; Sorjamaa, A. A review and comparison of strategies for multi-step ahead time series forecasting based on the NN5 forecasting competition. arXiv 2011, arXiv:1108.3259.
  37. Vochozka, M.; Vrbka, J.; Suler, P. Bankruptcy or Success? The Effective Prediction of a Company’s Financial Development Using LSTM. Sustainability 2020, 12, 7529.
Figure 1. Probability distributions for pandemic wave modeling.
Figure 2. Cases (a), Deaths (b), ICUs (c) and parameter variation per day (d). (a–c) From top to bottom: distribution function adjustment, density function adjustment and noise on May 11.
Figure 3. Second wave: cases (normal) (top-left); deaths (normal) (top-right); hospitalizations (normal) (bottom-left); and ICUs (Gompertz) (bottom-right).
Figure 4. Second wave: cases parameter monitoring.
Figure 5. Second wave cases Asturias: double exponential.
Figure 6. Second wave cases Asturias: double Pareto.
Figure 7. Second Wave in Madrid.
Figure 8. IRF and adjusted graph for each Region.
Figure 9. Expert system and SCIR diagram. (a) Expert system graphical interface. (b) Diagram of the SCIR model [10].
Figure 10. ML models example.
Figure 11. (Top) Cumulative cases adjustment. (Bottom) Cumulative cases forecast at 14 days.
Figure 12. (Top) Non-cumulative cases adjustment. (Bottom) Non-cumulative cases forecast at 14 days.
Figure 13. (Left) Cumulative cases metric comparisons. (Right) Non-cumulative cases metric comparisons.
Figure 14. Peak prediction comparisons.
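Table 1. The .csv file with the data (continues in Table 2).
| Date | Day | Cases 11/05/2020 | Cases 18/05/2020 | Cases 21/05/2020 | Deaths 11/05/2020 | ICUs 21/05/2020 |
|---|---|---|---|---|---|---|
| 25/02/2020 | 26 | 1 | 2 | 2 | | |
| 26/02/2020 | 27 | 5 | 6 | 6 | | |
| 27/02/2020 | 28 | 9 | 10 | 10 | | |
| 28/02/2020 | 29 | 19 | 20 | 20 | | |
| 29/02/2020 | 30 | 26 | 27 | 27 | | |
| 01/03/2020 | 31 | 51 | 53 | 53 | | |
| 02/03/2020 | 32 | 93 | 96 | 96 | | |
| 03/03/2020 | 33 | 139 | 142 | 142 | | |
| 04/03/2020 | 34 | 193 | 199 | 199 | | |
| 05/03/2020 | 35 | 305 | 311 | 311 | | |
| 06/03/2020 | 36 | 508 | 515 | 515 | | |
| 07/03/2020 | 37 | 729 | 738 | 738 | | |
| 08/03/2020 | 38 | 992 | 1003 | 1003 | 16 | 61 |
| 09/03/2020 | 39 | 1495 | 1508 | 1508 | 21 | 120 |
| 10/03/2020 | 40 | 2198 | 2213 | 2213 | 31 | 184 |
| 11/03/2020 | 41 | 2922 | 2943 | 2943 | 56 | 238 |
| 12/03/2020 | 42 | 3705 | 3732 | 3732 | 81 | 307 |
| 13/03/2020 | 43 | 4645 | 4672 | 4672 | 86 | 370 |
| 14/03/2020 | 44 | 5544 | 5576 | 5576 | 213 | 469 |
| 15/03/2020 | 45 | 6356 | 6392 | 6392 | 213 | 566 |
| 16/03/2020 | 46 | 7615 | 7653 | 7653 | 355 | 702 |
| 17/03/2020 | 47 | 9561 | 9601 | 9601 | 390 | 850 |
| 18/03/2020 | 48 | 11,309 | 11,356 | 11,356 | 498 | 1011 |
| 19/03/2020 | 49 | 13,353 | 13,399 | 13,399 | 628 | 1196 |
| 20/03/2020 | 50 | 15,676 | 15,722 | 15,722 | 804 | 1401 |
| 21/03/2020 | 51 | 17,346 | 17,397 | 17,397 | 1021 | 1532 |
| 22/03/2020 | 52 | 18,848 | 18,900 | 18,900 | 1263 | 1664 |
| 23/03/2020 | 53 | 21,516 | 21,569 | 21,569 | 1535 | 1813 |
| 24/03/2020 | 54 | 24,404 | 24,473 | 24,475 | 1825 | 1962 |
| 25/03/2020 | 55 | 27,344 | 27,420 | 27,422 | 2090 | 2117 |
| 26/03/2020 | 56 | 30,711 | 30,794 | 30,796 | 2412 | 2272 |
| 27/03/2020 | 57 | 33,068 | 33,160 | 33,162 | 2757 | 2369 |
| 28/03/2020 | 58 | 34,087 | 34,186 | 34,189 | 3082 | 2423 |
| 29/03/2020 | 59 | 34,959 | 35,058 | 35,061 | 3392 | 2464 |
| 30/03/2020 | 60 | 37,504 | 37,604 | 37,607 | 3603 | 2554 |
| 31/03/2020 | 61 | 39,303 | 39,404 | 39,409 | 3865 | 2627 |
| 01/04/2020 | 62 | 41,075 | 41,191 | 41,199 | 4175 | 2694 |
| 02/04/2020 | 63 | 42,896 | 43,027 | 43,038 | 4483 | 2764 |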
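Table 2. The .csv with the data.
| Date | Day | Cases 11/05/2020 | Cases 18/05/2020 | Cases 21/05/2020 | Deaths 11/05/2020 | ICUs 21/05/2020 |
|---|---|---|---|---|---|---|
| 03/04/2020 | 64 | 44,613 | 44,768 | 44,779 | 4723 | 2821 |
| 04/04/2020 | 65 | 45,496 | 45,660 | 45,671 | 4941 | 2854 |
| 05/04/2020 | 66 | 46,016 | 46,186 | 46,197 | 5136 | 2879 |
| 06/04/2020 | 67 | 47,568 | 47,749 | 47,763 | 5371 | 2958 |
| 07/04/2020 | 68 | 48,945 | 49,139 | 49,160 | 5586 | 3002 |
| 08/04/2020 | 69 | 50,357 | 50,556 | 50,580 | 5800 | 3038 |
| 09/04/2020 | 70 | 51,296 | 51,505 | 51,535 | 5972 | 3061 |
| 10/04/2020 | 71 | 52,146 | 52,367 | 52,400 | 6084 | 3091 |
| 11/04/2020 | 72 | 52,680 | 52,909 | 52,944 | 6278 | 3105 |
| 12/04/2020 | 73 | 53,017 | 53,250 | 53,285 | 6423 | 3122 |
| 13/04/2020 | 74 | 53,988 | 54,241 | 54,287 | 6568 | 3153 |
| 14/04/2020 | 75 | 55,062 | 55,343 | 55,398 | 6724 | 3180 |
| 15/04/2020 | 76 | 55,959 | 56,245 | 56,304 | 6877 | 3203 |
| 16/04/2020 | 77 | 56,792 | 57,084 | 57,157 | 7007 | 3214 |
| 17/04/2020 | 78 | 57,598 | 57,899 | 57,978 | 7132 | 3228 |
| 18/04/2020 | 79 | 60,558 | 60,859 | 60,941 | 7239 | 3238 |
| 19/04/2020 | 80 | 60,952 | 61,254 | 61,336 | 7351 | 3248 |
| 20/04/2020 | 81 | 61,568 | 61,882 | 61,972 | 7460 | 3278 |
| 21/04/2020 | 82 | 62,440 | 62,801 | 62,904 | 7577 | 3283 |
| 22/04/2020 | 83 | 63,558 | 63,921 | 64,050 | 7684 | 3288 |
| 23/04/2020 | 84 | 64,120 | 64,496 | 64,634 | 7765 | 3305 |
| 24/04/2020 | 85 | 64,785 | 65,163 | 65,310 | 7848 | 3307 |
| 25/04/2020 | 86 | 65,015 | 65,396 | 65,546 | 7922 | 3308 |
| 26/04/2020 | 87 | 65,477 | 65,857 | 66,007 | 7986 | 3309 |
| 27/04/2020 | 88 | 66,241 | 66,631 | 66,784 | 8048 | 3338 |
| 28/04/2020 | 89 | 66,884 | 67,293 | 67,460 | 8105 | 3355 |
| 29/04/2020 | 90 | 67,332 | 67,747 | 67,942 | 8176 | 3377 |
| 30/04/2020 | 91 | 67,714 | 68,154 | 68,356 | 8222 | 3392 |
| 01/05/2020 | 92 | 67,830 | 68,291 | 68,537 | 8292 | 3404 |
| 02/05/2020 | 93 | 67,947 | 68,408 | 68,654 | 8332 | 3421 |
| 03/05/2020 | 94 | 68,056 | 68,520 | 68,766 | 8376 | 3431 |
| 04/05/2020 | 95 | 68,447 | 68,924 | 69,178 | 8420 | 3442 |
| 05/05/2020 | 96 | 68,745 | 69,249 | 69,509 | 8466 | 3465 |
| 06/05/2020 | 97 | 69,110 | 69,627 | 69,885 | 8504 | 3485 |
| 07/05/2020 | 98 | 69,323 | 69,856 | 70,125 | 8552 | 3493 |
| 08/05/2020 | 99 | 69,566 | 70,132 | 70,407 | 8598 | 3508 |
| 09/05/2020 | 100 | 69,697 | 70,238 | 70,516 | 8644 | 3520 |
| 10/05/2020 | 101 | 69,730 | 70,292 | 70,570 | 8683 | 3529 |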
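Table 3. Cases updated on May 18: the .csv file with the results of the entire day-to-day history of the process (continues in Table 4).
| Date | Day | F.obs | f.obs | Peak | $\mu$ | $\sigma_{left}$ | $\sigma_{right}$ | $n$ | $R^2$ | p-Value | qnorm 0.99 | qnorm 0.999 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 29/02/2020 | 30 | 27 | 7 | 30 | 32 | 3.3 | 0 | 100 | 0.982587668 | 0.892176923 | 40 | 42 |
| 01/03/2020 | 31 | 53 | 26 | 31 | 35 | 3.6 | 0 | 400 | 0.978671691 | 0.622704742 | 43 | 46 |
| 02/03/2020 | 32 | 96 | 43 | 32 | 34 | 2.9 | 0 | 400 | 0.981876806 | 0.175458205 | 41 | 43 |
| 03/03/2020 | 33 | 142 | 46 | 33 | 38 | 4.1 | 0 | 1300 | 0.995770675 | 0.392234347 | 48 | 51 |
| 04/03/2020 | 34 | 199 | 57 | 34 | 34 | 2.8 | 0 | 400 | 0.997745553 | 0.137856774 | 41 | 43 |
| 05/03/2020 | 35 | 311 | 112 | 35 | 41 | 4.7 | 0 | 3100 | 0.996817645 | 0.240883244 | 52 | 56 |
| 06/03/2020 | 36 | 515 | 204 | 36 | 42 | 4.8 | 0 | 4900 | 0.987584896 | 0.284897495 | 53 | 57 |
| 07/03/2020 | 37 | 738 | 223 | 37 | 42 | 4.7 | 0 | 5200 | 0.994676239 | 0.124339758 | 53 | 57 |
| 08/03/2020 | 38 | 1003 | 265 | 38 | 44 | 5.2 | 0 | 8100 | 0.998540043 | 0.239198136 | 56 | 60 |
| 09/03/2020 | 39 | 1508 | 505 | 39 | 46 | 5.5 | 0 | 14,900 | 0.996779356 | 0.131424278 | 59 | 63 |
| 10/03/2020 | 40 | 2213 | 705 | 40 | 47 | 5.7 | 0 | 20,200 | 0.992783337 | 0.300331128 | 60 | 65 |
| 11/03/2020 | 41 | 2943 | 730 | 41 | 48 | 5.9 | 0 | 25,000 | 0.997812351 | 0.315691322 | 62 | 66 |
| 12/03/2020 | 42 | 3732 | 789 | 42 | 48 | 5.9 | 0 | 24,400 | 0.998798116 | 0.270683941 | 62 | 66 |
| 13/03/2020 | 43 | 4672 | 940 | 43 | 46 | 5.4 | 0 | 16,200 | 0.999136218 | 0.111431486 | 59 | 63 |
| 14/03/2020 | 44 | 5576 | 904 | 43 | 46 | 5.5 | 0 | 15,800 | 0.998952129 | 0.571789971 | 59 | 63 |
| 15/03/2020 | 45 | 6392 | 816 | 43 | 45 | 5.2 | 0 | 13,000 | 0.998944934 | 0.235342367 | 57 | 61 |
| 16/03/2020 | 46 | 7653 | 1261 | 46 | 45 | 5.2 | 4.4 | 13,000 | 0.999269903 | 0.363550861 | 55 | 59 |
| 17/03/2020 | 47 | 9601 | 1948 | 47 | 45 | 5.2 | 3.1 | 13,000 | 0.998123674 | 0.579021251 | 52 | 55 |
| 18/03/2020 | 48 | 11,356 | 1755 | 47 | 49 | 6.3 | 0 | 26,000 | 0.996732574 | 0.272478617 | 64 | 68 |
| 19/03/2020 | 49 | 13,399 | 2043 | 49 | 50 | 6.5 | 0 | 30,600 | 0.996779095 | 0.490069959 | 65 | 70 |
| 20/03/2020 | 50 | 15,722 | 2323 | 50 | 51 | 6.8 | 0 | 35,700 | 0.997098073 | 0.121977009 | 67 | 72 |
| 21/03/2020 | 51 | 17,397 | 1675 | 50 | 51 | 6.8 | 0 | 35,000 | 0.998182881 | 0.180665458 | 67 | 72 |
| 22/03/2020 | 52 | 18,900 | 1503 | 50 | 51 | 6.8 | 9.9 | 35,000 | 0.998601989 | 0.120079419 | 74 | 82 |
| 23/03/2020 | 53 | 21,569 | 2669 | 53 | 51 | 6.8 | 6.5 | 34,800 | 0.998561382 | 0.138391236 | 66 | 71 |
| 24/03/2020 | 54 | 24,473 | 2904 | 54 | 51 | 6.8 | 5.4 | 34,500 | 0.998229605 | 0.121359895 | 64 | 68 |
| 25/03/2020 | 55 | 27,420 | 2947 | 55 | 51 | 6.8 | 4.5 | 33,800 | 0.997254468 | 0.105764493 | 61 | 65 |
| 26/03/2020 | 56 | 30,794 | 3374 | 56 | 55 | 7.9 | 0 | 56,000 | 0.996014729 | 0.11940529 | 73 | 79 |
| 27/03/2020 | 57 | 33,160 | 2366 | 56 | 54 | 7.5 | 0 | 50,600 | 0.9969215 | 0.529316564 | 71 | 77 |
| 28/03/2020 | 58 | 34,186 | 1026 | 56 | 53 | 7.3 | 0 | 45,400 | 0.997913273 | 0.18053051 | 70 | 76 |
| 29/03/2020 | 59 | 35,058 | 872 | 56 | 53 | 7.4 | 0 | 45,200 | 0.998173357 | 0.105651154 | 70 | 76 |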
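Table 4. Cases updated on May 18: the .csv file with the results of the entire day-to-day history of the process (continues in Table 5).
| Date | Day | F.obs | f.obs | Peak | $\mu$ | $\sigma_{left}$ | $\sigma_{right}$ | $n$ | $R^2$ | p-Value | qnorm 0.99 | qnorm 0.999 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30/03/2020 | 60 | 37,604 | 2546 | 56 | 53 | 7.3 | 0 | 45,300 | 0.998303486 | 0.172127509 | 70 | 76 |
| 31/03/2020 | 61 | 39,404 | 1800 | 56 | 53 | 7.2 | 0 | 45,500 | 0.998245308 | 0.470549706 | 70 | 75 |
| 01/04/2020 | 62 | 41,191 | 1787 | 56 | 54 | 7.5 | 8.7 | 50,200 | 0.998433035 | 0.754592417 | 74 | 81 |
| 02/04/2020 | 63 | 43,027 | 1836 | 56 | 54 | 7.5 | 8.6 | 50,500 | 0.998428145 | 0.50656996 | 74 | 81 |
| 03/04/2020 | 64 | 44,768 | 1741 | 56 | 54 | 7.5 | 8.4 | 50,700 | 0.998184198 | 0.284094123 | 74 | 80 |
| 04/04/2020 | 65 | 45,660 | 892 | 56 | 54 | 7.5 | 8.4 | 50,500 | 0.998541295 | 0.349922514 | 74 | 80 |
| 05/04/2020 | 66 | 46,186 | 526 | 56 | 54 | 7.5 | 8.6 | 50,300 | 0.998878631 | 0.598595831 | 74 | 81 |
| 06/04/2020 | 67 | 47,749 | 1563 | 56 | 54 | 7.5 | 8.6 | 51,100 | 0.998464545 | 0.183556346 | 74 | 81 |
| 07/04/2020 | 68 | 49,139 | 1390 | 56 | 54 | 7.5 | 9.1 | 52,400 | 0.997432118 | 0.058691187 | 75 | 82 |
| 08/04/2020 | 69 | 50,556 | 1417 | 56 | 55 | 7.8 | 0 | 52,500 | 0.996017911 | 0.111161442 | 73 | 79 |
| 09/04/2020 | 70 | 51,505 | 949 | 56 | 55 | 7.9 | 11.1 | 56,500 | 0.998957163 | 0.054509831 | 81 | 89 |
| 10/04/2020 | 71 | 52,367 | 862 | 56 | 55 | 7.9 | 11.1 | 56,600 | 0.998954379 | 0.037195949 | 81 | 89 |
| 11/04/2020 | 72 | 52,909 | 542 | 56 | 55 | 7.9 | 11.2 | 56,600 | 0.999093668 | 0.052789849 | 81 | 90 |
| 12/04/2020 | 73 | 53,250 | 341 | 56 | 55 | 7.9 | 11.1 | 56,200 | 0.999270096 | 0.110086342 | 81 | 89 |
| 13/04/2020 | 74 | 54,241 | 991 | 56 | 55 | 7.9 | 11.3 | 56,900 | 0.999050173 | 0.030039666 | 81 | 90 |
| 14/04/2020 | 75 | 55,343 | 1102 | 56 | 55 | 7.9 | 12.2 | 58,300 | 0.998466827 | 0.009881759 | 83 | 93 |
| 15/04/2020 | 76 | 56,245 | 902 | 56 | 55 | 7.9 | 12.5 | 59,000 | 0.997872056 | 0.002736576 | 84 | 94 |
| 16/04/2020 | 77 | 57,084 | 839 | 56 | 55 | 7.9 | 13 | 59,800 | 0.997263961 | 0.000860309 | 85 | 95 |
| 17/04/2020 | 78 | 57,899 | 815 | 56 | 56 | 8.2 | 14.7 | 62,100 | 0.999148019 | 0.051330147 | 90 | 101 |
| 18/04/2020 | 79 | 60,859 | 2960 | 56 | 56 | 8.2 | 16.2 | 66,000 | 0.994984648 | 0.016081334 | 94 | 106 |
| 19/04/2020 | 80 | 61,254 | 395 | 56 | 56 | 8.2 | 16.3 | 65,900 | 0.995736332 | 0.024215511 | 94 | 106 |
| 20/04/2020 | 81 | 61,882 | 628 | 56 | 56 | 8.2 | 16.4 | 66,100 | 0.995644281 | 0.020793633 | 94 | 107 |
| 21/04/2020 | 82 | 62,801 | 919 | 56 | 56 | 8.2 | 17.2 | 67,200 | 0.994675811 | 0.009949682 | 96 | 109 |
| 22/04/2020 | 83 | 63,921 | 1120 | 56 | 57 | 8.5 | 19.1 | 70,000 | 0.996894671 | 0.031130121 | 101 | 116 |
| 23/04/2020 | 84 | 64,496 | 575 | 56 | 57 | 8.5 | 19.2 | 70,100 | 0.997000217 | 0.031958511 | 102 | 116 |
| 24/04/2020 | 85 | 65,163 | 667 | 56 | 57 | 8.5 | 20 | 70,900 | 0.996864121 | 0.026171537 | 104 | 119 |
| 25/04/2020 | 86 | 65,396 | 233 | 56 | 57 | 8.5 | 20.4 | 70,900 | 0.997486764 | 0.055788057 | 104 | 120 |
| 26/04/2020 | 87 | 65,857 | 461 | 56 | 57 | 8.5 | 20.3 | 70,800 | 0.997614211 | 0.060271108 | 104 | 120 |
| 27/04/2020 | 88 | 66,631 | 774 | 56 | 57 | 8.5 | 20.8 | 71,500 | 0.997206052 | 0.034021395 | 105 | 121 |
| 28/04/2020 | 89 | 67,293 | 662 | 56 | 57 | 8.5 | 21.6 | 72,300 | 0.996908785 | 0.02468125 | 107 | 124 |
| 29/04/2020 | 90 | 67,747 | 454 | 56 | 57 | 8.5 | 21.7 | 72,400 | 0.996957114 | 0.025102979 | 107 | 124 |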
Table 5. Cases updated on May 18: the .csv file with the results of the entire day-to-day history of the process.
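
| Date | Day | F.obs | f.obs | Peak | μ | σ left | σ right | n | R² | p-Value | qnorm 0.99 | qnorm 0.999 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30/04/2020 | 91 | 68,154 | 407 | 56 | 57 | 8.5 | 21.7 | 72,400 | 0.997054178 | 0.026233746 | 107 | 124 |
| 01/05/2020 | 92 | 68,291 | 137 | 56 | 57 | 8.5 | 24.4 | 73,900 | 0.997145056 | 0.101063259 | 114 | 132 |
| 02/05/2020 | 93 | 68,408 | 117 | 56 | 57 | 8.5 | 21.7 | 71,900 | 0.997874745 | 0.12149971 | 107 | 124 |
| 03/05/2020 | 94 | 68,520 | 112 | 56 | 57 | 8.5 | 21.9 | 71,800 | 0.99812767 | 0.225740331 | 108 | 125 |
| 04/05/2020 | 95 | 68,924 | 404 | 56 | 57 | 8.5 | 21.9 | 71,900 | 0.99809795 | 0.174545762 | 108 | 125 |
| 05/05/2020 | 96 | 69,249 | 325 | 56 | 57 | 8.5 | 22.4 | 72,200 | 0.998125597 | 0.188507265 | 109 | 126 |
| 06/05/2020 | 97 | 69,627 | 378 | 56 | 57 | 8.5 | 22.7 | 72,500 | 0.998056675 | 0.160258625 | 110 | 127 |
| 07/05/2020 | 98 | 69,856 | 229 | 56 | 57 | 8.5 | 22.6 | 72,400 | 0.998141899 | 0.166678645 | 110 | 127 |
| 08/05/2020 | 99 | 70,132 | 276 | 56 | 57 | 8.5 | 23 | 72,600 | 0.998176115 | 0.183701463 | 111 | 128 |
| 09/05/2020 | 100 | 70,238 | 106 | 56 | 57 | 8.5 | 22.8 | 72,400 | 0.998286944 | 0.246999497 | 110 | 127 |
| 10/05/2020 | 101 | 70,292 | 54 | 56 | 57 | 8.5 | 22.7 | 72,200 | 0.998403616 | 0.382879386 | 110 | 127 |
| 11/05/2020 | 102 | 70,482 | 190 | 56 | 57 | 8.5 | 22.7 | 72,200 | 0.998435983 | 0.369879957 | 110 | 127 |
| 12/05/2020 | 103 | 70,775 | 293 | 56 | 57 | 8.5 | 22.9 | 72,400 | 0.998409686 | 0.270053998 | 110 | 128 |
| 13/05/2020 | 104 | 70,964 | 189 | 56 | 57 | 8.5 | 23.1 | 72,500 | 0.998425921 | 0.276756083 | 111 | 128 |
| 14/05/2020 | 105 | 71,280 | 316 | 56 | 58 | 9 | 22.3 | 72,600 | 0.999084054 | 0.158835446 | 110 | 127 |
| 15/05/2020 | 106 | 71,572 | 292 | 56 | 58 | 9 | 22.6 | 72,800 | 0.999097634 | 0.10264384 | 111 | 128 |
| 16/05/2020 | 107 | 71,590 | 18 | 56 | 58 | 9 | 22.4 | 72,700 | 0.999115033 | 0.164105561 | 110 | 127 |
| 17/05/2020 | 108 | 71,595 | 5 | 56 | 58 | 9 | 22.3 | 72,600 | 0.99913069 | 0.284532441 | 110 | 127 |
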
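
The qnorm 0.99 and qnorm 0.999 columns of Table 5 appear consistent with normal quantiles of the form μ + z_q · σ right, rounded to the nearest day, with σ left used in the rows where σ right is reported as 0. This reading is inferred from the tabulated values and is an assumption, not a formula stated in this section. A minimal Python sketch under that assumption (the function name `qnorm_day` is hypothetical):

```python
from statistics import NormalDist


def qnorm_day(mu, sigma_left, sigma_right, q):
    """Day by which a fraction q of the fitted wave has elapsed.

    Assumption: the right tail is normal with mean mu and standard
    deviation sigma_right; when sigma_right is 0 the fit is treated
    as symmetric and sigma_left is used instead.
    """
    sigma = sigma_right if sigma_right > 0 else sigma_left
    z = NormalDist().inv_cdf(q)  # standard normal quantile z_q
    return round(mu + z * sigma)


# Parameter values copied from two rows of the table (days 91 and 60).
print(qnorm_day(57, 8.5, 21.7, 0.99), qnorm_day(57, 8.5, 21.7, 0.999))
print(qnorm_day(53, 7.3, 0.0, 0.99), qnorm_day(53, 7.3, 0.0, 0.999))
```

Under this assumption the two calls return 107 and 124 for day 91, and 70 and 76 for day 60, matching the tabulated values for those rows.
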
Table 6. Cases: The .csv file with the results of the forecast with the fitted model updated on 11 May 2020 with data until May 10 and the real data updated on 21 May 2020.

| Date | Day | Cases Forecast | Deaths Forecast | ICUs Forecast | Cases 21/05 | Deaths 21/05 | ICUs 21/05 |
|---|---|---|---|---|---|---|---|
| 11/05 | 102 | 69,907.52664 | 8760.086075 | 3522.268127 | 70,764 | 8720 | 3543 |
| 12/05 | 103 | 70,069.38546 | 8785.675876 | 3532.117916 | 71,064 | 8760 | 3555 |
| 13/05 | 104 | 70,217.06763 | 8808.931051 | 3541.764624 | 71,273 | 8779 | 3564 |
| 14/05 | 105 | 70,351.54658 | 8830.015338 | 3551.212438 | 71,616 | 8809 | 3574 |
| 15/05 | 106 | 70,473.75895 | 8849.086841 | 3560.465458 | 71,932 | 8826 | 3577 |
| 16/05 | 107 | 70,584.60256 | 8866.297502 | 3569.527702 | 71,956 | 8847 | 3584 |
| 17/05 | 108 | 70,684.93486 | 8881.792689 | 3578.403102 | 71,995 | 8863 | 3594 |
| 18/05 | 109 | 70,775.57183 | 8895.710879 | 3587.09551 | 72,121 | 8894 | 3600 |

Table 7. Cointegration tests for Madrid series.

| Test | Combination | p-Value |
|---|---|---|
| Phillips–Ouliaris | confirmed–hosp | <0.01 |
| Engle and Granger | confirmed–hosp | <0.01 |
| Phillips–Ouliaris | confirmed–icu | <0.01 |
| Engle and Granger | confirmed–icu | <0.01 |
| Phillips–Ouliaris | confirmed–deaths | <0.01 |
| Engle and Granger | confirmed–deaths | <0.01 |
| Phillips–Ouliaris | hosp–icu | <0.01 |
| Engle and Granger | hosp–icu | <0.01 |
| Phillips–Ouliaris | hosp–deaths | <0.01 |
| Engle and Granger | hosp–deaths | <0.01 |
| Phillips–Ouliaris | icu–deaths | <0.01 |
| Engle and Granger | icu–deaths | <0.01 |

Table 8. Metrics of error correction model for punctual data.

| Metric | Train | Test |
|---|---|---|
| MAPE | | 28.5 |
| MPE | | 8.76 |
| R² | 0.949 | 0.798 |

Table 9. Metrics of the error correction model for cumulative data.

| Metric | Train | Test |
|---|---|---|
| MAPE | 14.5856 | 3.3848 |
| MPE | 11.1601 | 1.6564 |
| R² | 0.9998 | 0.9913 |

Table 10. Mean of the metrics values for cumulative data. Error Correction Model (ECM) and Non-Linear Regression Model (NLRM).
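
| | Model | Type | ME | RMSE | MAE | MPE | MAPE | R² |
|---|---|---|---|---|---|---|---|---|
| 1 | Auto ARIMA | adjusted | −1248.54 | 1728.10 | 1269.68 | −7.85 | 8.59 | 1.00 |
| 2 | Auto ARIMA | prediction | −1112.81 | 6833.94 | 4632.59 | −1.40 | 3.81 | 0.99 |
| 3 | ECM (MATGEN) | adjusted | 81.20 | 1345.88 | 894.42 | −0.81 | 5.09 | 1.00 |
| 4 | ECM (MATGEN) | prediction | −19.95 | 6932.53 | 4867.94 | 1.44 | 4.06 | 0.99 |
| 5 | NLRM (MATGEN) | adjusted | 1348.81 | 2443.29 | 1790.66 | 16.39 | 17.37 | 1.00 |
| 6 | NLRM (MATGEN) | prediction | 2073.89 | 7198.78 | 5863.88 | 2.44 | 4.85 | 0.99 |
| 7 | SIR | adjusted | −6007.57 | 8953.81 | 6686.95 | −17,728.71 | 17,734.67 | 0.99 |
| 8 | SIR | prediction | −6714.01 | 8943.46 | 7271.55 | −6.84 | 7.61 | 0.99 |
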
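
The columns ME, RMSE, MAE, MPE, MAPE and R² used in Tables 10 to 15 are standard error metrics. The following is a minimal sketch of how they are commonly computed, not the authors' implementation. It assumes that errors are taken as forecast minus observation, that percentage errors are relative to the observation, and that R² is defined as 1 − SS_res/SS_tot (a definition that permits the negative values seen in some forecast rows). The sign convention and the function name `forecast_metrics` are assumptions.

```python
from math import sqrt


def forecast_metrics(actual, predicted):
    """Return ME, RMSE, MAE, MPE, MAPE and R2 for two equal-length series.

    Assumptions: error = predicted - actual; percentage errors are
    relative to the actual value, so every actual value must be nonzero.
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    pct = [100.0 * e / a for e, a in zip(errors, actual)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    return {
        "ME": sum(errors) / n,
        "RMSE": sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MPE": sum(pct) / n,
        "MAPE": sum(abs(x) for x in pct) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical toy data, not taken from the paper.
metrics = forecast_metrics([100, 200, 400], [110, 190, 400])
print({name: round(value, 3) for name, value in metrics.items()})
```

Because the percentage errors divide by the observation, observations close to zero inflate MPE and MAPE even when ME, RMSE and MAE stay moderate.
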
Table 11. Mean of the metrics values for non-cumulative data. Error Correction Model (ECM) and Non-Linear Regression Model (NLRM).
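
| | Model | Type | ME | RMSE | MAE | MPE | MAPE | R² |
|---|---|---|---|---|---|---|---|---|
| 1 | Auto ARIMA | adjusted | −28.51 | 340.40 | 219.37 | −4.26 | 21.07 | 0.96 |
| 2 | Auto ARIMA | prediction | 283.29 | 1376.83 | 891.92 | −28.59 | 56.27 | 0.71 |
| 3 | ECM (MATGEN) | adjusted | −14.78 | 402.27 | 250.27 | −0.67 | 17.67 | 0.94 |
| 4 | ECM (MATGEN) | prediction | 186.95 | 1133.20 | 853.33 | −11.54 | 39.33 | 0.74 |
| 5 | NLRM (MATGEN) | adjusted | 6.32 | 1052.47 | 766.99 | 9.02 | 48.03 | 0.58 |
| 6 | NLRM (MATGEN) | prediction | 110.35 | 1785.72 | 1506.44 | −36.81 | 80.40 | 0.28 |
| 7 | SIR | adjusted | −62.88 | 1073.00 | 779.05 | −13,852.97 | 13,888.64 | 0.57 |
| 8 | SIR | prediction | −402.71 | 1789.38 | 1595.02 | −18.22 | 61.42 | 0.24 |
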
Table 12. Metrics’ values for the adjustment of the cumulative data. Error Correction Model (ECM) and Non-Linear Regression Model (NLRM).
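
| Set | Model | ME | RMSE | MAE | MPE | MAPE | R² |
|---|---|---|---|---|---|---|---|
| 1 | Auto ARIMA | −485.88 | 700.55 | 486.71 | −12.34 | 13.81 | 0.99 |
| 1 | ECM (MATGEN) | 168.90 | 400.31 | 289.59 | −2.18 | 8.24 | 1.00 |
| 1 | NLRM (MATGEN) | 1260.45 | 1551.80 | 1291.45 | 33.55 | 33.70 | 0.94 |
| 1 | SIR | −1522.24 | 1823.16 | 1522.24 | −46,288.96 | 46,288.96 | 0.92 |
| 2 | Auto ARIMA | −858.09 | 1201.04 | 858.72 | −10.84 | 11.95 | 0.99 |
| 2 | ECM (MATGEN) | 253.20 | 477.60 | 349.93 | −1.96 | 6.66 | 1.00 |
| 2 | NLRM (MATGEN) | 764.84 | 1255.90 | 1035.12 | 22.35 | 23.21 | 0.99 |
| 2 | SIR | 2212.52 | 2870.74 | 2220.49 | −12,599.75 | 12,637.41 | 0.96 |
| 3 | Auto ARIMA | −978.93 | 1323.82 | 979.50 | −10.20 | 11.21 | 1.00 |
| 3 | ECM (MATGEN) | 302.09 | 519.49 | 392.29 | −1.71 | 6.24 | 1.00 |
| 3 | NLRM (MATGEN) | 589.18 | 1511.35 | 1251.04 | 20.61 | 22.35 | 0.99 |
| 3 | SIR | 1628.68 | 2289.74 | 1736.60 | −11,456.21 | 11,487.04 | 0.99 |
| 4 | Auto ARIMA | −1088.73 | 1470.23 | 1089.27 | −9.91 | 10.87 | 1.00 |
| 4 | ECM (MATGEN) | 389.35 | 627.22 | 473.02 | −1.26 | 6.09 | 1.00 |
| 4 | NLRM (MATGEN) | 642.38 | 1498.33 | 1246.00 | 19.55 | 21.10 | 1.00 |
| 4 | SIR | −3325.18 | 3955.14 | 3325.18 | −31,652.14 | 31,652.14 | 0.97 |
| 5 | Auto ARIMA | −1375.22 | 1860.10 | 1375.71 | −9.32 | 10.18 | 1.00 |
| 5 | ECM (MATGEN) | 522.39 | 850.41 | 642.24 | −0.55 | 5.82 | 1.00 |
| 5 | NLRM (MATGEN) | 508.56 | 1823.34 | 1392.07 | 16.28 | 18.56 | 1.00 |
| 5 | SIR | −3793.62 | 4639.00 | 3817.71 | −28,582.67 | 28,582.70 | 0.98 |
| 6 | Auto ARIMA | −1530.03 | 2051.45 | 1530.49 | −9.01 | 9.81 | 1.00 |
| 6 | ECM (MATGEN) | 436.65 | 910.16 | 696.56 | −0.52 | 5.60 | 1.00 |
| 6 | NLRM (MATGEN) | 616.64 | 1643.56 | 1324.56 | 17.57 | 18.93 | 1.00 |
| 6 | SIR | −6159.50 | 7507.13 | 6159.50 | −22,241.48 | 22,241.48 | 0.95 |
| 7 | Auto ARIMA | −1630.21 | 2043.77 | 1630.49 | −5.87 | 6.36 | 1.00 |
| 7 | ECM (MATGEN) | 39.24 | 1406.28 | 1048.43 | −0.30 | 3.92 | 1.00 |
| 7 | NLRM (MATGEN) | 2051.99 | 3257.69 | 2473.12 | 11.33 | 12.01 | 1.00 |
| 7 | SIR | −11,425.65 | 13,086.26 | 11,425.65 | −13,595.76 | 13,595.76 | 0.98 |
| 8 | Auto ARIMA | −1257.97 | 1796.90 | 1340.12 | −4.42 | 4.82 | 1.00 |
| 8 | ECM (MATGEN) | −513.38 | 2151.91 | 1649.35 | −0.21 | 3.33 | 1.00 |
| 8 | NLRM (MATGEN) | 2271.70 | 3192.56 | 2437.28 | 10.69 | 10.92 | 1.00 |
| 8 | SIR | −10,433.77 | 11,931.45 | 10,433.77 | −5642.22 | 5642.22 | 0.99 |
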
Table 13. Metrics values for the forecast of cumulative data. Error Correction Model (ECM) and Non-Linear Regression Model (NLRM).
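
| Set | Model | ME | RMSE | MAE | MPE | MAPE | R² |
|---|---|---|---|---|---|---|---|
| 1 | Auto ARIMA | 742.27 | 2607.38 | 1972.18 | 0.22 | 4.40 | 0.95 |
| 1 | ECM (MATGEN) | 8078.15 | 10,443.86 | 8078.15 | 13.90 | 13.90 | 0.22 |
| 1 | NLRM (MATGEN) | −1722.69 | 2601.55 | 2217.13 | −3.68 | 5.64 | 0.95 |
| 1 | SIR | −8332.40 | 9505.03 | 8332.40 | −24.04 | 24.04 | 0.35 |
| 2 | Auto ARIMA | −361.36 | 1250.37 | 1059.50 | −0.85 | 1.61 | 0.99 |
| 2 | ECM (MATGEN) | 1986.20 | 2158.57 | 1986.20 | 2.48 | 2.48 | 0.98 |
| 2 | NLRM (MATGEN) | 9290.25 | 10,806.19 | 9418.25 | 10.11 | 10.34 | 0.47 |
| 2 | SIR | 1505.19 | 2685.56 | 2262.73 | 2.22 | 3.08 | .628.62 |
| 3 | Auto ARIMA | −7682.18 | 8583.72 | 7682.18 | −8.62 | 8.62 | 0.74 |
| 3 | ECM (MATGEN) | −3316.37 | 4683.55 | 3621.03 | −3.27 | 3.70 | 0.92 |
| 3 | NLRM (MATGEN) | 6570.61 | 7325.77 | 6570.61 | 6.34 | 6.34 | 0.81 |
| 3 | SIR | −2875.42 | 4134.21 | 3236.74 | −2.89 | 3.34 | 0.94 |
| 4 | Auto ARIMA | −4195.20 | 4234.32 | 4195.20 | −4.23 | 4.23 | 0.95 |
| 4 | ECM (MATGEN) | −2057.25 | 2967.20 | 2111.82 | −1.75 | 1.81 | 0.98 |
| 4 | NLRM (MATGEN) | 5299.33 | 5672.03 | 5299.33 | 4.95 | 4.95 | 0.92 |
| 4 | SIR | −4942.52 | 6137.30 | 5013.43 | −4.62 | 4.71 | 0.90 |
| 5 | Auto ARIMA | 3244.98 | 8616.59 | 6127.89 | 1.44 | 4.07 | 0.80 |
| 5 | ECM (MATGEN) | 1976.52 | 4041.05 | 2769.67 | 1.12 | 1.82 | 0.96 |
| 5 | NLRM (MATGEN) | 4409.42 | 4942.49 | 4543.91 | 3.19 | 3.29 | 0.93 |
| 5 | SIR | −5467.07 | 5976.66 | 5467.07 | −4.22 | 4.22 | 0.90 |
| 6 | Auto ARIMA | 6351.74 | 11,367.51 | 8513.56 | 3.24 | 4.95 | 0.51 |
| 6 | ECM (MATGEN) | 6700.05 | 9047.66 | 6849.74 | 3.84 | 3.96 | 0.69 |
| 6 | NLRM (MATGEN) | 5235.72 | 6889.86 | 5807.06 | 3.09 | 3.53 | 0.82 |
| 6 | SIR | −16,797.09 | 17,040.06 | 16,797.09 | −12.63 | 12.63 | −0.09 |
| 7 | Auto ARIMA | −708.88 | 1054.81 | 868.51 | −0.28 | 0.34 | 0.98 |
| 7 | ECM (MATGEN) | −3022.85 | 3068.54 | 3022.85 | −1.21 | 1.21 | 0.83 |
| 7 | NLRM (MATGEN) | −4102.31 | 5676.48 | 4633.04 | −1.64 | 1.86 | 0.41 |
| 7 | SIR | −8786.20 | 8826.30 | 8786.20 | −3.63 | 3.63 | −0.42 |
| 8 | Auto ARIMA | −6293.84 | 8310.95 | 6641.69 | −2.13 | 2.25 | −0.09 |
| 8 | ECM (MATGEN) | −10,504.03 | 11,506.51 | 10,504.03 | −3.61 | 3.61 | −1.09 |
| 8 | NLRM (MATGEN) | −8389.22 | 10,053.77 | 8421.72 | −2.86 | 2.87 | −0.60 |
| 8 | SIR | −9156.26 | 9312.66 | 9156.26 | −3.21 | 3.21 | −4.44 |
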
Table 14. Metrics values for the adjustment of the non-cumulative data. Error Correction Model (ECM) and Non-Linear Regression Model (NLRM).
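
| Set | Model | ME | RMSE | MAE | MPE | MAPE | R² |
|---|---|---|---|---|---|---|---|
| 1 | Auto ARIMA | −36.55 | 200.52 | 117.06 | −7.82 | 25.31 | 0.92 |
| 1 | ECM (MATGEN) | 20.56 | 171.19 | 95.72 | 0.99 | 16.63 | 0.94 |
| 1 | NLRM (MATGEN) | 4.78 | 435.54 | 298.20 | 20.66 | 47.93 | 0.64 |
| 1 | SIR | −74.26 | 434.98 | 300.27 | −34,245.49 | 34,267.18 | 0.64 |
| 2 | Auto ARIMA | −49.77 | 288.63 | 179.84 | −5.37 | 23.94 | 0.94 |
| 2 | ECM (MATGEN) | 14.12 | 326.32 | 181.57 | 0.53 | 17.40 | 0.93 |
| 2 | NLRM (MATGEN) | 4.27 | 702.71 | 483.61 | 13.99 | 46.54 | 0.66 |
| 2 | SIR | 32.40 | 721.61 | 505.11 | −12,598.90 | 12,658.23 | 0.64 |
| 3 | Auto ARIMA | −44.57 | 295.50 | 188.00 | −4.45 | 22.92 | 0.95 |
| 3 | ECM (MATGEN) | 10.46 | 323.84 | 194.94 | 0.15 | 18.09 | 0.94 |
| 3 | NLRM (MATGEN) | 46.30 | 868.08 | 605.99 | 13.91 | 48.45 | 0.56 |
| 3 | SIR | 24.38 | 843.97 | 604.11 | −11,454.88 | 11,513.44 | 0.58 |
| 4 | Auto ARIMA | −58.22 | 297.94 | 194.23 | −4.64 | 22.15 | 0.96 |
| 4 | ECM (MATGEN) | 1.37 | 351.12 | 212.71 | 0.07 | 17.86 | 0.94 |
| 4 | NLRM (MATGEN) | 5.11 | 863.13 | 606.81 | 12.21 | 46.72 | 0.65 |
| 4 | SIR | −85.49 | 874.68 | 640.00 | −23,183.48 | 23,209.36 | 0.64 |
| 5 | Auto ARIMA | −65.79 | 295.95 | 196.90 | −4.30 | 20.50 | 0.97 |
| 5 | ECM (MATGEN) | −23.47 | 401.64 | 250.76 | −0.22 | 17.71 | 0.94 |
| 5 | NLRM (MATGEN) | 3.02 | 1021.29 | 731.38 | 10.00 | 46.99 | 0.64 |
| 5 | SIR | −93.38 | 1044.13 | 769.12 | −20,880.91 | 20,908.29 | 0.63 |
| 6 | Auto ARIMA | −58.64 | 288.40 | 191.54 | −3.86 | 19.56 | 0.97 |
| 6 | ECM (MATGEN) | −30.64 | 427.64 | 268.51 | −0.25 | 17.56 | 0.94 |
| 6 | NLRM (MATGEN) | 2.15 | 1071.60 | 798.41 | 11.13 | 47.70 | 0.63 |
| 6 | SIR | −122.06 | 1169.80 | 856.83 | −16,662.54 | 16,688.03 | 0.56 |
| 7 | Auto ARIMA | −0.20 | 405.01 | 268.60 | −3.44 | 19.76 | 0.95 |
| 7 | ECM (MATGEN) | −27.63 | 476.34 | 319.69 | −1.37 | 17.78 | 0.93 |
| 7 | NLRM (MATGEN) | 1.68 | 1276.71 | 1001.69 | 4.26 | 48.29 | 0.50 |
| 7 | SIR | −101.72 | 1314.40 | 1011.77 | −10,167.64 | 10,199.50 | 0.47 |
| 8 | Auto ARIMA | 6.03 | 396.20 | 264.21 | −3.59 | 19.91 | 0.94 |
| 8 | ECM (MATGEN) | −27.85 | 436.26 | 285.29 | −1.90 | 17.74 | 0.93 |
| 8 | NLRM (MATGEN) | 0.17 | 1192.96 | 921.71 | 3.67 | 49.28 | 0.47 |
| 8 | SIR | −45.29 | 1183.58 | 899.62 | −4958.01 | 4995.86 | 0.48 |
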
Table 15. Metrics values for the forecast of the non-cumulative data. Error Correction Model (ECM) and Non-Linear Regression Model (NLRM).
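
| Set | Model | ME | RMSE | MAE | MPE | MAPE | R² |
|---|---|---|---|---|---|---|---|
| 1 | Auto ARIMA | 516.72 | 770.60 | 656.07 | 22.21 | 26.73 | 0.71 |
| 1 | ECM (MATGEN) | 1292.29 | 1665.30 | 1313.82 | 33.28 | 33.97 | −0.35 |
| 1 | NLRM (MATGEN) | −123.29 | 1393.85 | 1317.77 | −6.65 | 58.84 | 0.05 |
| 1 | SIR | −778.54 | 1584.90 | 1359.83 | −48.49 | 84.86 | −0.23 |
| 2 | Auto ARIMA | 236.05 | 312.25 | 278.05 | 10.58 | 11.38 | 0.97 |
| 2 | ECM (MATGEN) | 137.44 | 442.86 | 379.73 | 4.20 | 13.92 | 0.94 |
| 2 | NLRM (MATGEN) | 1003.11 | 1887.68 | 1462.84 | 23.83 | 34.71 | −0.00 |
| 2 | SIR | −385.05 | 1766.60 | 1666.21 | −11.23 | 54.82 | 0.12 |
| 3 | Auto ARIMA | −848.62 | 1028.91 | 848.62 | −27.42 | 27.42 | 0.70 |
| 3 | ECM (MATGEN) | −629.06 | 1028.97 | 750.27 | −22.44 | 26.12 | 0.70 |
| 3 | NLRM (MATGEN) | 476.07 | 1845.24 | 1564.23 | 11.08 | 35.72 | 0.05 |
| 3 | SIR | −661.61 | 1912.08 | 1765.00 | −20.45 | 55.07 | −0.02 |
| 4 | Auto ARIMA | −51.16 | 298.94 | 213.27 | 0.66 | 6.42 | 0.98 |
| 4 | ECM (MATGEN) | −427.13 | 902.31 | 636.70 | −11.87 | 18.96 | 0.84 |
| 4 | NLRM (MATGEN) | 282.85 | 2087.31 | 1943.95 | 7.62 | 45.32 | 0.13 |
| 4 | SIR | −388.76 | 2141.24 | 2020.72 | −9.83 | 55.55 | 0.09 |
| 5 | Auto ARIMA | 1664.09 | 2240.33 | 1664.09 | 28.02 | 28.02 | −0.07 |
| 5 | ECM (MATGEN) | 738.73 | 1046.20 | 912.73 | 17.54 | 21.55 | 0.77 |
| 5 | NLRM (MATGEN) | 459.34 | 2123.00 | 1908.62 | 10.57 | 43.08 | 0.04 |
| 5 | SIR | 49.68 | 2084.13 | 1929.12 | 1.33 | 48.04 | 0.07 |
| 6 | Auto ARIMA | 2005.82 | 2518.26 | 2005.82 | 35.48 | 35.48 | −0.43 |
| 6 | ECM (MATGEN) | 1372.00 | 1633.92 | 1459.07 | 28.89 | 30.13 | 0.40 |
| 6 | NLRM (MATGEN) | 695.40 | 2054.18 | 1639.29 | 15.35 | 35.80 | 0.05 |
| 6 | SIR | −753.06 | 2119.06 | 1943.78 | −23.21 | 61.44 | −0.01 |
| 7 | Auto ARIMA | −126.86 | 317.46 | 239.17 | −12.89 | 20.96 | 0.87 |
| 7 | ECM (MATGEN) | −17.64 | 459.68 | 295.66 | 1.56 | 17.84 | 0.72 |
| 7 | NLRM (MATGEN) | −702.71 | 1138.53 | 967.96 | −104.96 | 132.27 | −0.71 |
| 7 | SIR | 223.08 | 897.94 | 822.94 | 12.09 | 47.43 | −0.06 |
| 8 | Auto ARIMA | −1129.68 | 1365.44 | 1230.28 | −285.34 | 293.76 | −1.42 |
| 8 | ECM (MATGEN) | −971.04 | 1206.39 | 1078.69 | −143.49 | 152.12 | −0.89 |
| 8 | NLRM (MATGEN) | −1207.97 | 1489.81 | 1246.88 | −251.29 | 257.50 | −1.88 |
| 8 | SIR | −636.58 | 1066.63 | 952.83 | −70.27 | 104.03 | −0.39 |
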
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

González-Pérez, B.; Núñez, C.; Sánchez, J.L.; Valverde, G.; Velasco, J.M. Expert System to Model and Forecast Time Series of Epidemiological Counts with Applications to COVID-19. Mathematics 2021, 9, 1485. https://doi.org/10.3390/math9131485