Comparison of Positivity in Two Epidemic Waves of COVID-19 in Colombia with FDA

Urbano-Leon, Cristhian Leonardo; Escabias, Manuel

doi:10.3390/stats5040059


by Cristhian Leonardo Urbano-Leon † and Manuel Escabias *,†

Department of Statistics and Operations Research, University of Granada, 18071 Granada, Spain

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Stats 2022, 5(4), 993-1003; https://doi.org/10.3390/stats5040059

Submission received: 2 October 2022 / Revised: 22 October 2022 / Accepted: 24 October 2022 / Published: 28 October 2022

(This article belongs to the Section Applied Stochastic Models)


Abstract

We use functional data analysis (FDA) to examine whether there are significant differences between two waves of COVID-19 contagion in Colombia between 7 July 2020 and 20 July 2021. A pointwise functional t-test is used first; we then present an alternative test for paired samples, which has a theoretical null distribution and performs well in small samples. Our test produces a scalar p-value, which gives a global assessment of the significance of the difference between the positivity curves and thus complements the existing pointwise tests.

Keywords:

COVID-19; FDA; hypothesis testing for paired functional data; Colombian national strike

1. Introduction

Functional data analysis (FDA) is a branch of statistics that has had great development in recent years due to its multiple applications in different fields of science [1,2,3,4,5,6,7]. One of the reasons for its popularity is that all consecutive observations of a continuous phenomenon can be viewed as a single curve, since the objective of this field is to analyze sets of curves. FDA has a wide range of descriptive and inferential techniques to accomplish this [8,9].

During the global emergency caused by COVID-19, the Colombian government chose the positivity rate as a key variable for early decisions on managing the disease. The rate can be calculated over different periods (daily, weekly, monthly, etc.). In this study, the positivity rate is the daily percentage of positive COVID-19 tests out of the total number of tests processed. Its trend helps determine the presence of an epidemic wave or outbreak [10,11].
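The definition of the daily positivity rate can be illustrated with a minimal sketch. The function name `positivity_rate` and the daily counts below are hypothetical and are not taken from the Colombian dataset.

```python
def positivity_rate(positive_tests: int, processed_tests: int) -> float:
    """Daily positivity rate: percentage of positive tests among processed tests."""
    if processed_tests <= 0:
        raise ValueError("processed_tests must be positive")
    if not 0 <= positive_tests <= processed_tests:
        raise ValueError("positive_tests must lie between 0 and processed_tests")
    return 100.0 * positive_tests / processed_tests


# Hypothetical (positive, processed) counts for three consecutive days.
daily_counts = [(120, 1500), (95, 1250), (210, 1400)]
daily_rates = [positivity_rate(p, n) for p, n in daily_counts]
```

A sequence such as `daily_rates`, observed for one department over a wave, is the kind of discrete record from which one positivity curve is later built.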

Because the positivity rate is related to waves of contagion [10,12], we assume that confirmed coronavirus cases in Colombia in one year serve to identify the most critical moments of a trend change in that time. As the information discriminated by departments in Colombia was only reported from July 2020, we have usable information from 19 July 2020 onward; we can see several contagion waves, but we focus our attention on two of them, depicted in Figure 1.

In 2021, Colombia witnessed strong protests. People took to the streets because of multiple social factors, which hindered isolation and other protective measures against COVID-19. This may have prolonged the contagion wave and increased its magnitude, since the protests began on 28 April 2021 and lasted for more than two months [13].

Because of this, we consider two case studies. In the first case, we assume that the two waves of contagion have the same duration. However, in the second case, we consider that both waves have a different duration. The purpose of this study is to determine whether there are significant differences between the first and second waves in each case study separately. Because the data are continuous and additionally paired, we used the FDA methodology with a pointwise functional t-test, followed by a hypothesis testing proposal based on the integral of the difference in positivity rate curves, to test the equality of functional means.

The rest of the paper is organized as follows. Section 2 describes the data used in the study, introduces functional data, the pointwise t-test, and the theoretical component of our proposed test, and outlines the basis of our simulation. Section 3 shows the functional data constructed and the results of the tests carried out. Finally, Section 4 discusses the results.

2. Materials and Methods

In this section, we provide context on the topics of the article and propose a hypothesis test for functional data.

2.1. About the COVID-19 Dataset in Colombia and the Two Case Studies

In Colombia, data on COVID-19 are officially reported by the National Health Institute (NHI) of the Colombian Ministry of Health. Colombia is divided into 32 regional sections called departments and one special capital district, Bogotá D.C. Each of these reports to the NHI the number and type of tests performed and the number of positive cases. In some cases this information is not properly reported, which leaves a problem of incomplete information. In this study, we only consider departments whose information from 20 July 2020 to 20 July 2021 is complete. We therefore use 23 departments and, including Bogotá D.C., 24 regions in total: Antioquia, Atlántico, Bolívar, Boyacá, Caldas, Casanare, Cauca, Cesar, Córdoba, Cundinamarca, Guajira, Huila, Magdalena, Meta, Nariño, Norte de Santander, Putumayo, Quindío, Risaralda, Santander, Sucre, Tolima, Valle del Cauca, and Bogotá D.C.

It is suspected that the national strike that began in Colombia on 28 April 2021 and lasted for at least two months [13] led people to neglect personal protection measures against COVID-19, and that this lengthened the wave of SARS-CoV-2 contagion in Colombia. Two study scenarios were considered. The first, called Case 1, takes the first wave of contagion from 20 November 2020 to 18 February 2021 and the second wave from 19 February 2021 to 20 May 2021. That is, both waves last three months each, ignoring the national strike. The second scenario, called Case 2, takes the first wave from 20 November 2020 to 18 February 2021 and the second wave from 19 February 2021 to 20 July 2021. In other words, in the second scenario the first wave lasts three months and the second wave lasts five months. See Figure 2 and Figure 3.

It is important to note that, for each case, the data from the first and second waves have a paired behavior. For each department, the first wave is followed by the second, which constitutes before and after observations.

2.2. About Functional Data Analysis

Functional data analysis is a statistical methodology in which the objects of study are not scalar values, but continuous functions [7,14], considered as observations of a stochastic process $\{X(t) : t \in T\}$. Thus, the set of functional observations $x_{1}(t), x_{2}(t), \dots, x_{n}(t)$ constitutes a simple random sample of it, and each observation is called a functional datum.

FDA draws on the mathematical theory of functions collected in functional analysis, since the functional observations considered are smooth, square-integrable curves. That is, if $\{x_{i}(t)\}_{i=1}^{n}$ is a functional random sample associated with the stochastic process $\{X(t) : t \in T\}$, then Equation (1) holds for $i = 1, 2, \dots, n$:

$$\int_{T} x_{i}^{2}(t)\, dt < \infty. \tag{1}$$

Thus, the functional data are elements of a Hilbert space over the field $\mathbb{R}$ of real numbers. This space is denoted $L_{2}([a, b])$, where the interval $[a, b]$ is the domain of the elements of $L_{2}$, which, without loss of generality, can be mapped to the interval $[0, 1]$ [15,16].

In FDA, the underlying stochastic process $\{X(t) : t \in T\}$ is assumed to be a second-order stochastic process [17], so its expected value exists as a function, defined as in Equation (2):

$$\mu : T \longrightarrow \mathbb{R}, \qquad t \longmapsto \mu(t) = \int_{\Omega} X(t, \omega)\, dP(\omega), \tag{2}$$

on the probability space $(\Omega, \mathcal{A}, P)$.

Although FDA treats the observations of the stochastic process as continuous functions, the curves must be obtained from pointwise observations of the phenomenon. There are different methodologies for this; here, we use the vector space structure of $L_{2}([0, 1])$ and assume that the observations are elements of a finite-dimensional subspace $\mathcal{H}$ of $L_{2}([0, 1])$. This allows us to assume the existence of a finite basis $B = \{\phi_{j}(t)\}_{j=1}^{k}$ of size $k$ for the subspace $\mathcal{H}$, where each element of $B$ is a basis function and $k$ is the dimension of $\mathcal{H}$. Thus, each functional datum $x_{i}(t)$ is uniquely expressed as a linear combination of the elements of $B$ with real coefficients, as in Equation (3):

$$x_{i}(t) = \sum_{j=1}^{k} c_{i,j}\, \phi_{j}(t), \tag{3}$$

where $x_{i}(t)$ is the $i$-th functional datum, $\phi_{j}(t)$ is the $j$-th basis function, and $c_{i,j}$ is the $j$-th coefficient for the $i$-th functional datum, for $i = 1, 2, \dots, n$, $j = 1, 2, \dots, k$, and $t \in [0, 1]$.

In this work, the values of $c_{i,j} \in \mathbb{R}$ are obtained by least squares [1]. This allows the mean function $\mu(t)$ to be estimated through Expression (4):

$$\bar{X}(t) = n^{-1} \sum_{i=1}^{n} x_{i}(t), \tag{4}$$

where $\{x_{i}(t)\}_{i=1}^{n}$ is a set of functional data [8].
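Because each functional datum has the basis expansion of Equation (3), the sample mean function can be written in terms of the coefficients alone. The short derivation below uses $\bar{c}_{j}$, a notation introduced here only for illustration:

```latex
\bar{X}(t)
  = n^{-1}\sum_{i=1}^{n}\sum_{j=1}^{k} c_{i,j}\,\phi_{j}(t)
  = \sum_{j=1}^{k}\Big(n^{-1}\sum_{i=1}^{n} c_{i,j}\Big)\phi_{j}(t)
  = \sum_{j=1}^{k}\bar{c}_{j}\,\phi_{j}(t),
\qquad
\bar{c}_{j} = n^{-1}\sum_{i=1}^{n} c_{i,j}.
```

In other words, averaging the curves amounts to averaging their coefficient vectors.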

There are different types of bases for generating functional data. In this study, we consider basis functions of the form given in Equation (5):

$$\phi_{j}(t) = \sqrt{2} \cos\big((j - 1) \pi t\big), \tag{5}$$

for $j = 2, 3, \dots, k$, since, together with the constant function $\phi_{1}(t) = 1$, the set of functions $\{\phi_{j}\}_{j=1}^{k}$ constitutes a finite orthonormal basis for a $k$-dimensional vector subspace of the Hilbert space $L_{2}([0, 1])$ [18,19].
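The construction of one functional datum from pointwise observations can be sketched as follows. This is a minimal illustration with NumPy, not the authors' implementation: the helper names `cosine_basis` and `fit_coefficients`, the choice $k = 4$, and the synthetic curve are all assumptions made for the example.

```python
import numpy as np


def cosine_basis(t, k):
    """Evaluate phi_1, ..., phi_k of Equation (5) at the points t.

    Returns an array of shape (len(t), k).
    """
    t = np.asarray(t, dtype=float)
    freq = np.arange(k)                      # values of j - 1 for j = 1, ..., k
    basis = np.sqrt(2.0) * np.cos(np.pi * np.outer(t, freq))
    basis[:, 0] = 1.0                        # phi_1(t) = 1, the constant function
    return basis


def fit_coefficients(t, values, k):
    """Least-squares coefficients c_1, ..., c_k for one observed curve."""
    design = cosine_basis(t, k)
    coef, *_ = np.linalg.lstsq(design, np.asarray(values, dtype=float), rcond=None)
    return coef


# Synthetic curve: 91 daily observations with time rescaled to [0, 1].
t_obs = np.linspace(0.0, 1.0, 91)
true_coef = np.array([10.0, 2.0, -1.0, 0.5])
rng = np.random.default_rng(0)
y_obs = cosine_basis(t_obs, 4) @ true_coef + rng.normal(scale=0.1, size=t_obs.size)
coef_hat = fit_coefficients(t_obs, y_obs, k=4)

# Numerical check of orthonormality with a midpoint rule on [0, 1]:
# the Gram matrix of the basis should be close to the identity.
m = 2000
t_mid = (np.arange(m) + 0.5) / m
phi = cosine_basis(t_mid, 4)
gram = phi.T @ phi / m
```

Here `coef_hat` recovers the synthetic coefficients up to noise, and `gram` approximates the matrix of inner products $\int_0^1 \phi_j(t)\phi_l(t)\,dt$.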

2.3. Hypothesis Testing in Functional Data

A hypothesis test for functional data stems from the same theoretical foundation as a hypothesis test for scalar data. Accordingly, an initial hypothesis about a population parameter is stated, known as the null hypothesis and denoted $H_{0}$, which is contrasted with a generally complementary hypothesis about the same parameter, called the alternative hypothesis and denoted $H_{1}$ [20].

Since the objective of our study is to determine whether there are significant differences between the two waves of COVID-19 contagion, and since the positivity rate behaves continuously, the functional data methodology is used. We use the functional mean as the parameter that defines the hypothesis test, and denote by $\mu_{X}$ and $\mu_{Y}$ the functional means of the stochastic processes $\{X(t) : t \in T\}$ and $\{Y(t) : t \in T\}$ associated with the COVID-19 positivity rate in the first and second waves of contagion, respectively. With this, the contrast of hypotheses is stated in Equation (6):

$$\begin{aligned} H_{0} &: \mu_{X} = \mu_{Y} \\ H_{1} &: \mu_{X} \neq \mu_{Y}. \end{aligned} \tag{6}$$

Based on the data samples, a test statistic is computed. Its value is located within the null distribution, which is the probability distribution that the statistic would follow if the null hypothesis were correct. Using the null distribution, a p-value is then calculated, which gives the probability, under the null hypothesis, that the test statistic is at least as extreme as the observed statistic [20].

The literature on hypothesis testing for functional data is diverse and covers different approaches, such as [21,22,23,24,25,26,27,28]. However, as early work on functional data shows, hypothesis testing for functional data can be performed with a pointwise t-test [1], based on the idea of fixing a value $t \in [0, 1]$. The hypothesis test of Equation (6) is then performed for each of the infinitely many points $t \in [0, 1]$, and since the values $x_{i}(t)$ and $y_{i}(t)$ are scalars for each fixed $t$, a t-test for scalar data can be applied.

We first make the statistical comparison with a pointwise t-test, which is a natural extension of the t-test to the functional context. This methodology has the limitation that, when the scalar tests are performed over the domain $[0, 1]$, the p-value is itself a function of $t$, so it is difficult to reach a global conclusion on the contrast. For this reason, we propose a different approach to hypothesis testing for functional data, which produces a global p-value that supports a decision over the entire domain rather than over sections of it.

2.4. Another Hypothesis Test Approach for Functional Data

In our case study, the data of interest, in addition to being continuous in nature, exhibit paired behavior. That is, for each department, there is a curve for the first wave and a curve for the second wave. Therefore, we have 24 pairs of curves. Thus, we proceed as in the scalar case and restate the contrast as in Equation (7):

$$\begin{aligned} H_{0} &: \mu_{X} - \mu_{Y} = 0_{L_{2}([0, 1])} \\ H_{1} &: \mu_{X} - \mu_{Y} \neq 0_{L_{2}([0, 1])}, \end{aligned} \tag{7}$$

where $0_{L_{2}([0, 1])}$ denotes the zero function in $L_{2}([0, 1])$.

If the difference curve of two continuous functions is close to the zero function, the two functions are similar. Therefore, if the null hypothesis is true, the integral of the difference curve must be zero, and we obtain the contrast of Equation (8):

$$\begin{aligned} H_{0}^{'} &: \int_{T} \big(\mu_{X}(t) - \mu_{Y}(t)\big)\, dt = \int_{T} 0_{L_{2}(T)}\, dt \\ H_{1}^{'} &: \int_{T} \big(\mu_{X}(t) - \mu_{Y}(t)\big)\, dt \neq \int_{T} 0_{L_{2}(T)}\, dt. \end{aligned} \tag{8}$$

For the hypothesis test of Equation (8), we present a test statistic based on the average of the integrals of the functional differences, called the Mean Integral of Differences (MID), which is obtained with a little algebra on the sample estimates of the parameters. Thus, given two paired sets of functional data $\{x_{i}\}_{i=1}^{n}$ and $\{y_{i}\}_{i=1}^{n}$, the MID contrast statistic is defined by Equation (9):

$$\mathrm{MID} = n^{-1} \sum_{i=1}^{n} \int_{0}^{1} d_{i}(t)\, dt, \tag{9}$$

where $d_{i}(t)$ is defined as in Equation (10):

$$d_{i}(t) = x_{i}(t) - y_{i}(t), \tag{10}$$

for each $i = 1, 2, \dots, n$.
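With the cosine basis of Equation (5), the integrals entering Equation (9) take a simple form. The derivation below assumes that $x_{i}$ and $y_{i}$ are both expanded in that same basis; the coefficient symbols $c^{x}_{i,j}$ and $c^{y}_{i,j}$ are notation introduced here only for illustration:

```latex
\int_{0}^{1}\phi_{1}(t)\,dt = 1,
\qquad
\int_{0}^{1}\phi_{j}(t)\,dt
  = \sqrt{2}\,\frac{\sin\big((j-1)\pi\big)}{(j-1)\pi} = 0
  \quad (j = 2, \dots, k),

\int_{0}^{1} d_{i}(t)\,dt
  = \sum_{j=1}^{k}\big(c^{x}_{i,j} - c^{y}_{i,j}\big)\int_{0}^{1}\phi_{j}(t)\,dt
  = c^{x}_{i,1} - c^{y}_{i,1},

\mathrm{MID} = n^{-1}\sum_{i=1}^{n}\big(c^{x}_{i,1} - c^{y}_{i,1}\big).
```

Under this assumption, MID depends only on the first (constant-term) coefficients of the paired curves, that is, on the average level of each curve over $[0, 1]$.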

Under the null hypothesis, our proposed test statistic approximately follows a normal distribution with mean zero and variance $\sigma^{2}$, written $\mathrm{MID} \sim N(0, \sigma)$, as guaranteed by the central limit theorem. The contrast can be carried out by standardizing MID using Expression (11):

$$\mathrm{S.MID} = \frac{\mathrm{MID} - 0}{\sigma}. \tag{11}$$

Here, $\sigma$ can be estimated from Equation (12):

$$\sigma = \frac{S_{d}}{\sqrt{n}}, \tag{12}$$

where $S_{d}^{2}$ is the sample variance of the set $\{\int_{0}^{1} d_{i}(t)\, dt\}_{i=1}^{n}$. In this way, a scalar p-value can be computed as $2 P(Z \geq |\mathrm{S.MID}|)$, where $Z$ is a real random variable such that $Z \sim N(0, 1)$.
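The computation of the standardized statistic and its p-value can be sketched with the Python standard library. The sketch takes the standard deviation of MID to be $S_{d}/\sqrt{n}$, the usual standard error of a mean; the function name `mid_test` and the input values are hypothetical and are not the paper's data.

```python
import math
import statistics


def mid_test(integrals_of_differences):
    """Paired MID test from the values int_0^1 d_i(t) dt, i = 1, ..., n.

    Returns (MID, standardized statistic S.MID, two-sided p-value).
    """
    n = len(integrals_of_differences)
    if n < 2:
        raise ValueError("at least two pairs of curves are required")
    mid = statistics.fmean(integrals_of_differences)
    s_d = statistics.stdev(integrals_of_differences)   # sample standard deviation
    sigma_hat = s_d / math.sqrt(n)                      # assumed std of MID
    s_mid = mid / sigma_hat
    # For Z ~ N(0, 1): 2 * P(Z >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(s_mid) / math.sqrt(2.0))
    return mid, s_mid, p_value


# Hypothetical integrals of the difference curves for six pairs.
example_integrals = [-1.0, 0.5, -2.0, -0.5, -1.5, 0.0]
mid, s_mid, p_value = mid_test(example_integrals)
```

In the study itself the input would be the 24 integrals of the difference curves, one per region.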

In addition to the theoretical approach, to apply our contrast statistic to the specific study cases, we decided to also run a simulation process to find a null distribution and perform the test—considering that the null hypothesis is that the two paired functional samples come from populations with the same mean. Thus, we simulate paired scalar points from a common functional mean for the two groups and use them to construct the curves that constitute the functional data samples of size 24; then, we apply the test statistic. The process is repeated 4000 times for that sample size.
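A simulation of this kind can be sketched as follows. The source does not specify the common mean, the noise model, or the number of basis functions, so the choices below (a sinusoidal mean, independent Gaussian noise at each time point, $k = 5$, 91 observation points) are assumptions made purely for illustration, as are the function names.

```python
import numpy as np


def simulate_null_mid(n_pairs=24, n_points=91, k=5, n_rep=4000,
                      noise_sd=1.0, seed=0):
    """Simulate MID under H0: both groups share the same functional mean."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)

    # Cosine basis of Equation (5) evaluated on the observation grid.
    design = np.sqrt(2.0) * np.cos(np.pi * np.outer(t, np.arange(k)))
    design[:, 0] = 1.0
    fit = np.linalg.pinv(design)             # least-squares operator, shape (k, n_points)

    mu = 10.0 + 3.0 * np.sin(np.pi * t)      # hypothetical common mean curve
    mids = np.empty(n_rep)
    for r in range(n_rep):
        x = mu + rng.normal(scale=noise_sd, size=(n_pairs, n_points))
        y = mu + rng.normal(scale=noise_sd, size=(n_pairs, n_points))
        # Least squares is linear, so the first coefficient of the fitted
        # difference curve equals its integral over [0, 1] in this basis.
        integrals = (x - y) @ fit[0]
        mids[r] = integrals.mean()
    return mids


def simulated_p_value(observed_mid, null_mids):
    """Two-sided p-value: share of simulated |MID| at least as large as observed."""
    return float(np.mean(np.abs(null_mids) >= abs(observed_mid)))


null_mids = simulate_null_mid()
lower, upper = np.quantile(null_mids, [0.025, 0.975])   # critical values at 0.05
```

The empirical quantiles `lower` and `upper` play the role of critical values found by frequency, and `simulated_p_value` compares an observed MID with the simulated null distribution.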

Moreover, a quick Shapiro–Wilk test is performed on the values of the integral of the differences of the functions to assess whether there is evidence that the resulting data do not come from a normal distribution. The p-value for this test is reported later in the results section, which supports the use of the proposed methodology.

3. Results

In this section, we present the results of our study applying the FDA methodology to the positivity data for COVID-19 in Colombia in both cases considered.

3.1. Constructed Functional Data

Figure 4 shows the functional data of the COVID-19 positivity rate in Colombia for Case 1, built with the orthonormal basis of Equation (5). The top left and top right panels show the functional positivity data for the first and second waves, respectively. The bottom left panel shows the functional means of the positivity rate in both waves, which are the target of the comparison. The comparison is carried out through the difference curves of the positivity rate between the two waves, shown in the bottom right panel.

Similarly, the functional data of the COVID-19 positivity rate in Colombia for Case 2 are depicted in Figure 5. The top left and top right panels show the functional positivity data for the first and second waves, respectively. A certain difference in the positivity trend between the two waves can be seen. It also appears in the functional means of both waves, shown in the bottom left panel, which are the target of the comparison in Case 2. In addition, the difference curves of the positivity rate shown in the figure follow a different trend from the difference curves of Case 1.

3.2. Pointwise Hypothesis Contrast for Curves

As stated above, the pointwise t-test assumes that, for each $t \in [0, 1]$, a scalar t-test can be performed with the values of the functions evaluated at the point $t$. That is, a t-test is applied to the two groups of scalar values $\{x_{i}(t)\}_{i=1}^{n}$ and $\{y_{i}(t)\}_{i=1}^{n}$, which result from evaluating the functions $x_{i}$ and $y_{i}$ at the same fixed point $t$, for $i = 1, 2, \dots, n$. In this case, the contrast is defined as in Equation (13):

$$\begin{aligned} H_{0} &: \mu_{X(t)} - \mu_{Y(t)} = 0 \\ H_{1} &: \mu_{X(t)} - \mu_{Y(t)} \neq 0, \end{aligned} \tag{13}$$

where $\mu_{X(t)}$ and $\mu_{Y(t)}$ are scalar parameters, since $t$ is a fixed value. This test is performed using the test statistic of Equation (14):

$$\frac{\bar{X}(t) - \bar{Y}(t)}{s_{d} / \sqrt{n}}, \tag{14}$$

where $s_{d}$ is the standard deviation of the scalar values $\{x_{i}(t) - y_{i}(t)\}_{i=1}^{n}$. In this way, we take 1000 values of $t$ within the interval $[0, 1]$ and perform the test for each of these. We then obtain 1000 p-values, which are shown in Figure 6: in the left panel for case 1 and in the right panel for case 2.
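The pointwise procedure can be sketched with SciPy's paired t-test applied column by column over a grid of 1000 values of $t$. The curves below are synthetic and the function name `pointwise_paired_t_test` is hypothetical; in practice the arrays would hold the fitted curves $x_{i}$ and $y_{i}$ evaluated on the grid.

```python
import numpy as np
from scipy import stats


def pointwise_paired_t_test(x_curves, y_curves):
    """Paired t-test of Equation (14) at every grid point.

    x_curves, y_curves: arrays of shape (n, m) holding n paired curves
    evaluated at the same m grid points. Returns (t_stats, p_values),
    each of length m.
    """
    x = np.asarray(x_curves, dtype=float)
    y = np.asarray(y_curves, dtype=float)
    if x.shape != y.shape:
        raise ValueError("paired samples must have the same shape")
    result = stats.ttest_rel(x, y, axis=0)   # one paired test per column
    return result.statistic, result.pvalue


# Synthetic paired curves: the second group drifts upward as t grows.
rng = np.random.default_rng(1)
t_grid = np.linspace(0.0, 1.0, 1000)
n = 24
base = 10.0 + 3.0 * np.sin(np.pi * t_grid)
x_curves = base + rng.normal(size=(n, t_grid.size))
y_curves = base + 2.0 * t_grid + rng.normal(size=(n, t_grid.size))

t_stats, p_values = pointwise_paired_t_test(x_curves, y_curves)
significant = p_values < 0.05                # grid points below the 0.05 line
```

Plotting `p_values` against `t_grid` with a horizontal line at 0.05 gives a picture of the same kind as Figure 6.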

Note that, so far, it is not possible to determine whether there are significant differences between the two waves of COVID-19 contagion through the positivity curves, in a global way.

3.3. Another Hypothesis Test Approach for Functional Data

Following the methodology described above, two groups of 24 curves were simulated in pairs under the null hypothesis that the functional means are equal, and the MID test statistic was calculated on the sample. This process was repeated 4000 times separately for each case study, yielding 4000 values of the MID statistic. After obtaining the null distribution in the form of a histogram, the test statistic was calculated on the real functional positivity data for COVID-19 in Colombia in both case studies. From the simulated histogram, the critical values corresponding to a significance level of 0.05 were found by frequency. The histograms of the values found, together with the value of the statistic and the respective critical values, are shown in Figure 7: in the left panel for case 1 and in the right panel for case 2.

In addition, Table 1 shows the p-values of the Shapiro–Wilk test performed on the 24 integrals of the differences of the paired functional data in the two case studies. It also shows the value of the test statistic in each case, its p-value under the theoretical distribution, its p-value under simulation, and the critical values of the null distribution found under simulation.

Note that now, with the use of the scalar p-values found with our proposal, it is possible to decide on the existence of significant differences between both waves of COVID-19 contagion in a global way. Thus, for case 2, we can say that there are significant differences between the two waves of contagion, since the p-value in this case is 0.00001, under the theoretical null distribution, and 0.0015 under the simulated null distribution. Therefore, the hypothesis that the functional means of the positivity rate are the same in both waves of contagion by COVID-19 is rejected.

In turn, for the first case, since the p-values are 0.08906 under the theoretical null distribution, and 0.0875 under the simulated null distribution, the hypothesis that the functional means of the positivity rate are the same in both waves of contagion by COVID-19 is not rejected. It is not rejected by a very small margin with respect to the reference value of 0.05 significance.

4. Discussion

It is important to point out that, as Figure 6 shows, the pointwise t-test for functional data allows us to identify the sections of the domain of the functions where there are significant differences. In terms of the case studies, the dates between which the two contagion waves differ most could be identified. Nevertheless, the pointwise t-test is insufficient to determine whether or not the two contagion waves are significantly different in each case study.

Our proposed test for paired functional data, however, allows a global decision to be made based on the scalar p-value. As can be seen in Figure 7 (left panel), for Case 1 the contrast statistic does not reject the hypothesis of equal means at a significance level of 0.05; i.e., there is no evidence of significant differences between the two contagion waves, even though the p-value of 0.0875 (Table 1) is relatively close. If the significance level is taken as 0.1, the decision would be to reject the null hypothesis, although again by a narrow margin. With regard to Case 2, as shown in Figure 7 (right panel), the p-value found with our statistic is 0.0015, so we can say that there is sufficient evidence that the two contagion waves are significantly different.

Because of the above, although the p-value in case 1 leaves some doubt, it is important to highlight the difference between the p-values in both cases from a broader point of view, which seems to support the idea that the two case studies are remarkably different, and that the national strike in Colombia should not be ignored when analyzing epidemiological behavior, since the case studies suggest a possible change in the inclusion of positive data due to noncompliance with care measures during the national strike.

Author Contributions

Both authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study correspond to official data of the confirmed positive cases of COVID-19 in Colombia, data PCR tests for COVID-19 in Colombia, and data on antigen tests for COVID-19 in Colombia, respectively available at: https://www.datos.gov.co/Salud-y-Protecci-n-Social/Casos-positivos-de-COVID-19-en-Colombia/gt2j-8ykr/data (accessed on 14 September 2022), https://www.datos.gov.co/Salud-y-Protecci-n-Social/Pruebas-PCR-procesadas-de-COVID-19-en-Colombia-Dep/8835-5baf (accessed on 14 September 2022), and https://www.datos.gov.co/Salud-y-Protecci-n-Social/Ant-geno-procesadas-de-COVID-19-en-Colombia-Depart/ci85-cyhe/data (accessed on 14 September 2022).

Acknowledgments

This paper is partially supported by the research group FQM-307 of the Government of Andalusia (Spain) and by the project PID2020-113961GB-I00 of the Spanish Ministry of Science and Innovation (also supported by the FEDER programme). The authors also acknowledge the financial support of the Consejería de Conocimiento, Investigación y Universidad, Junta de Andalucía (Spain), and the FEDER programme for project A-FQM-66-UGR20. Additionally, the authors acknowledge financial support by the IMAG–María de Maeztu grant CEX2020-001105-M/AEI/10.13039/501100011033.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ramsay, J.; Silverman, B. Functional Data Analysis, 2nd ed.; Springer Series in Statistics; Springer: New York, NY, USA, 2005.
2. Stewart, K.J.; Darcy, D.P.; Daniel, S.L. Opportunities and Challenges Applying Functional Data Analysis to the Study of Open Source Software Evolution. Stat. Sci. 2006, 21, 167–178.
3. Jank, W.; Shmueli, G. Functional Data Analysis in Electronic Commerce Research. Stat. Sci. 2006, 21, 155–166.
4. Ferraty, F. Recent Advances in Functional Data Analysis and Related Topics; Springer: Berlin/Heidelberg, Germany, 2011.
5. Horváth, L.; Kokoszka, P. Inference for Functional Data with Applications; Springer Series in Statistics; Springer: New York, NY, USA, 2012.
6. Sørensen, H.; Goldsmith, J.; Sangalli, L.M. An introduction with medical applications to functional data analysis. Stat. Med. 2013, 32, 5222–5240.
7. Srivastava, A.; Klassen, E.P. Functional and Shape Data Analysis; Springer Series in Statistics; Springer: New York, NY, USA, 2016.
8. Ramsay, J.O.; Silverman, B.W. Applied Functional Data Analysis: Methods and Case Studies; Springer Series in Statistics; Springer: New York, NY, USA, 2002.
9. Hsing, T.; Eubank, R. Theoretical Foundations of Functional Data Analysis, with an Introduction to Linear Operators, 1st ed.; John Wiley & Sons Ltd.: West Sussex, UK, 2015.
10. Dallal, A.A.; AlDallal, U.; Dallal, J.A. Positivity rate: An indicator for the spread of COVID-19. Curr. Med. Res. Opin. 2021, 37, 2067–2076.
11. Fu, Y.; Li, Y.; Guo, E.; He, L.; Liu, J.; Yang, B.; Li, F.; Wang, Z.; Li, Y.; Xiao, R.; et al. Dynamics and Correlation Among Viral Positivity, Seroconversion, and Disease Severity in COVID-19. Ann. Intern. Med. 2021, 174, 453–461.
12. Furuse, Y.; Ko, Y.K.; Ninomiya, K.; Suzuki, M.; Oshitani, H. Relationship of Test Positivity Rates with COVID-19 Epidemic Dynamics. Int. J. Environ. Res. Public Health 2021, 18, 4655.
13. Juliana, R. Colombia 2021: Between Crises and Hope. Rev. Cienc. Política 2022, 42, 255–280.
14. Clarkson, D.B.; Fraley, C.; Gu, C.C.; Ramsay, J.O. S+ Functional Data Analysis; Springer: New York, NY, USA, 2005.
15. Rudin, W. Functional Analysis, 2nd ed.; McGraw Hill: Singapore, 1991.
16. Royden, H.L.; Fitzpatrick, P.M. Real Analysis, 4th ed.; Prentice Hall: Hoboken, NJ, USA, 2010.
17. Escabias, M. Reducción de Dimensión en Regresión Logistica Funcional. Ph.D. Thesis, Universidad de Granada, Granada, Spain, 2002.
18. Eubank, R.L. Nonparametric Regression and Spline Smoothing, 2nd ed.; Marcel Dekker Inc.: New York, NY, USA, 1998.
19. Olaya, J. Metodos de Regresión no Paramétrica; Universidad del Valle: Cali, Colombia, 2012.
20. Kenny, J.F.; Keeping, E.S. Mathematics of Statistics. Part Two, 2nd ed.; D. Van Nostrand Company, Inc.: New York, NY, USA, 1951.
21. Cox, D.D.; Lee, J.S.; Follen, M. A two sample test for functional data. Commun. Stat. Appl. Methods 2015, 22, 121–135.
22. Fan, J.; Lin, S. Test of significance when data are curves. Am. Stat. Assoc. 1998, 93, 111–122.
23. Zhang, J.; Chen, J. Statistical Inferences For Functional Data. Ann. Stat. 2007, 35.
24. Cuevas, A.; Febrero, M.; Fraiman, R. An anova test for functional data. Comput. Stat. Data Anal. 2004, 47, 111–122.
25. Degras, D. Simultaneous Confidence Bands for the Mean of Functional Data. Wiley Interdiscip. Rev. Comput. Stat. 2017, 9, e1397.
26. Cuesta-Albertos, J.; Febrero-Bande, M. A simple multiway anova for functional data. Bus. Econ. 2007, 19, 537–557.
27. Qiu, Z.; Chen, J.; Zhang, J.T. Two-sample tests for multivariate functional data with applications. Comput. Stat. Data Anal. 2021, 157, 107–160.
28. Melendez, R.; Giraldo, R.; Leiva, V. Sign, Wilcoxon and Mann-Whitney Tests for Functional Data: An Approach Based on Random Projections. Mathematics 2021, 9, 44.

Figure 1. Two waves of contagion by COVID-19 between the dates 19 July 2020 and 20 July 2021.

Figure 2. Case 1: Two waves of contagion by COVID-19 in Colombia without a national strike. The dotted red lines mark the start and end of the first wave of contagion, which occurred between 20 November 2020 and 18 February 2021. The dashed blue lines mark the start and end of the second wave of contagion, which occurred between 19 February 2021 and 20 May 2021.

Figure 3. Case 2: Two waves of contagion by COVID-19 in Colombia with a national strike. The dotted red lines mark the start and end of the first wave of contagion, which occurred between 20 November 2020 and 18 February 2021. The dashed blue lines mark the start and end of the second wave of contagion, which occurred between 19 February 2021 and 20 July 2021.

Figure 4. Functional positivity data for COVID-19 in Colombia for Case 1: Functional positivity in the first wave (top left); functional positivity in the second wave (top right); functional means of positivity in the first wave in solid line and the second wave in dashed line (bottom left); positivity difference curves (bottom right).

Figure 5. Functional positivity data for COVID-19 in Colombia for Case 2: Functional positivity in the first wave (top left); functional positivity in the second wave (top right); functional means of positivity in the first wave in solid line and the second wave in dashed line (bottom left); positivity difference curves (bottom right).

Figure 6. P-values obtained for the values of t in interval [0, 1] in the solid line and the reference line of 0.05 of significance in the dashed line for case 1 (left) and case 2 (right).

Figure 7. Histogram of the 4000 MID values obtained under simulation, and the MID contrast statistic calculated on the real positivity functional data in dashed line and the critical values in dotted lines for case 1 (left panel) and case 2 (right panel).

Table 1. Summary of our test results in both cases.

Case	Test Statistic	p-Value under Simulation	p-Value under Theoretical Distribution	p-Value of Shapiro–Wilk Test
Case 1	−2.77	0.0875	0.08906	0.6029165
Case 2	−4.8	0.0015	0.00001	0.7006806

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
