Next Article in Journal
Productivity vs. Evenness in the U.S. Financial Market: A Business Ecosystem Perspective
Next Article in Special Issue
Stochastic Stabilization of Dual-Layer Rumor Propagation Model with Multiple Channels and Rumor-Detection Mechanism
Previous Article in Journal
Alternating Direction Method of Multipliers-Based Constant Modulus Waveform Design for Dual-Function Radar-Communication Systems
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

A New Analysis of Real-Time Fatality Rate in the Initial Stage of COVID-19

School of Mathematics and Physics, China University of Geosciences, Wuhan 430074, China
*
Author to whom correspondence should be addressed.
Entropy 2023, 25(7), 1028; https://doi.org/10.3390/e25071028
Submission received: 4 May 2023 / Revised: 22 June 2023 / Accepted: 4 July 2023 / Published: 6 July 2023
(This article belongs to the Special Issue Modeling and Control of Epidemic Spreading in Complex Societies)

Abstract

:
Mortality is one of the most important epidemiological measures and a key indicator of the effectiveness of potential treatments or interventions. In this paper, a permutation test method of variance analysis is proposed to test the null hypothesis that the real-time fatality rates of multiple groups were equal during the epidemic period. In light of large-scale simulation studies, the proposed test method can accurately identify the differences between different groups and display satisfactory performance. We apply the proposed method to the real dataset of the COVID-19 epidemic in mainland China (excluding Hubei), Hubei Province (excluding Wuhan), and Wuhan from 31 January 2020 to 30 March 2020. By comparing the differences in the disease severity for differential cities, we show that the severity of the early disease of COVID-19 may be related to the effectiveness of interventions and the improvement in medical resources.

1. Introduction

An epidemic is a disease for which observation value exceeds the expected value. Once an epidemic affects a large population or spreads globally, it is called a pandemic [1]. In recent decades, the incidence and mortality of emerging infectious diseases have increased worldwide, causing significant social, political, and economic damage [2]. For example, several large-scale epidemics that have broken out in recent years include Severe Acute Respiratory Syndrome (SARS), H1N1 influenza, and the ebola virus, which have undoubtedly had a huge impact on the social, political, economic, and other aspects of the infected countries. In 2019, the initial outbreak of COVID-19 in Wuhan, Hubei Province, China, spread rapidly around the world, posing a major threat to global public health. According to the World Health Organization [3], through 1 January 2023, more than 656 million confirmed cases of COVID-19 and more than 6.6 million deaths have been reported worldwide.
Mortality is an important epidemiological measurement, which is used to indicate the severity of the disease and measure the virulence of the disease. The World Health Organization has defined mortality as the ratio of the cumulative number of deaths to the cumulative number of confirmed cases. Mortality is used as an indicator to measure the severity of the disease [4]. Yip et al. [5] found that, in the early stage of the epidemic, because the infection was not over, few patients died of the disease, so the estimated fatality rate was low. With the development of the epidemic, more and more people died at the end of the disease, while more and more people were diagnosed, so the estimated rate was still low. Last [6] argued that the simple estimator is not sensitive to changes in mortality during the epidemic and will only perform well at the end of the epidemic.
Lam et al. [7] believed that the mortality of emerging epidemics may change over time, and the downward trend of mortality may reflect the effectiveness of government interventions and the improvement of medical resources. Ambreen et al. [8] considered the impact of meteorological factors on COVID-19. Chen et al. [9] also believed that pairing assistance is an effective way to curb COVID-19. During the outbreak of an epidemic, Lam et al. [7] recommended using real-time fatality rate to measure the severity of the epidemic, rather than traditional mortality. The real-time fatality rate is defined as the probability of death based on the counting process method conditional on death or recovery. Compared with traditional mortality, the real-time fatality rate has been proven to be more sensitive to capturing changes in mortality during the course of an epidemic. Reich et al. [10] defined the relative fatality rate as the ratio of the fatality rate of a group to that of another reference group and compared the fatality rate of a specific group using a generalized linear model framework. Chen et al. [11] used the generalized linear model framework to estimate the fatality rate according to the maximal likelihood method. However, the assumptions about the real-time fatality rate in their methods are very strict and may be more suitable for chronic diseases than for emerging infectious diseases. In order to detect changes in the real-time fatality rate, Yip et al. [12] proposed a competitive risk model implemented by the counting process to estimate the real-time fatality rate. Yip et al. [5] considered the chain multinomial model to estimate the real-time fatality rate. Lam et al. [7] developed a single-sample sequential test for the null hypothesis of constant fatality rate. Qu et al. [13] developed a multi-sample sequential test for the null hypothesis of real-time fatality rate.
During the highly infectious epidemic outbreak, the government needs to formulate effective interventions as soon as possible to effectively curb the spread of the disease to reduce mortality, minimize the severity of the disease, and ensure the safety of people’s lives. Due to various reasons, it is difficult to collect complete and detailed data. However, it is relatively easy to obtain summary data on confirmed cases, deaths, and recoveries of COVID-19 from the official channels of health departments in various regions.
In order to solve the above problem, it is necessary to carry out relevant statistical tests and compare the severity of diseases between multiple groups according to a simple data structure. Inspired by Yip et al. [12], we propose the analysis of variance using permutation test for real-time fatality rates among different groups over a fixed time period. With the accumulation of statistical evidence, the null hypothesis can be rejected. Under the proposed testing procedure, we can also test the differences in real-time fatality rate between different regions, different treatment groups, and different age groups, so as to provide information for the health sector. We recall that Zhao et al. [14] tested the clinical symptoms of patients using Remdesivir and other therapeutic drugs through survival analysis of log-rank and found that the survival rate of patients using Remdesivir was significantly improved within 8 weeks. In essence, the potential use of multi-group trials is to compare the changes in the real-time fatality rate with Remdesivir and standard treatment. In the empirical analysis section, we compare the differences in real-time fatality rate among different regions. The results suggest that the effectiveness of interventions and medical resource assistance implemented by government departments during public health emergencies are useful and critical.
In Section 2, the test statistic is put forward to study the asymptotic distribution. In Section 3, the relevant knowledge of the permutation test is introduced. In Section 4, simulation experiments are carried out to evaluate the effectiveness of the proposed test method in different scenarios. In Section 5, the proposed test procedure is applied to the comparison of disease severity among three independent clusters during the outbreak: mainland China (excluding Hubei Province), Hubei Province (excluding Wuhan), and Wuhan. In Section 6, we summarize the provided results.

2. Preliminaries

2.1. Test Statistic

In this paper, days are taken as the unit, and time 0 can be set as the day where observation is implemented for a particular group. Time [ 0 , t ] is divided into T fixed intervals, the research objects are divided into k groups according to certain rules, and the sample size of each group is m i ( i = 1 , 2 , , k ) . Information about the number of cases, deaths, and recoveries in the group at the end of each interval is collected in order from T = 0 . Denote the numbers of deaths, recoveries, and hospitalizations in the i-th group at the end of day t by n i , D ( t ) , n i , R ( t ) , and n i , H ( t ) , respectively. We further denote the cumulative numbers of deaths and recoveries in the group at the end of day t by N i , D ( t ) and N i , R ( t ) .
Assuming that the number of confirmed cases in the i-th group at the end of day t 1 is H i ( t 1 ) , they are independently and identically distributed for each confirmed patient and will be in one of the following three states on day t: death, recovery, hospitalization. That is,
H i ( t 1 ) = n i , D ( t ) + n i , R ( t ) + n i , H ( t ) ,
assuming that the probability of death is p i , D ( t ) , the probability of recovery is p i , R ( t ) , and the probability of hospitalization is p i , H ( t ) . According to the multinomial distribution proposed by Yip et al. [5], we have
P X 1 = n i , D ( t ) , X 2 = n i , R ( t ) , X 3 = n i , H ( t ) = H i ( t 1 ) ! n i , D ( t ) ! n i , R ( t ) ! n i , H ( t ) ! p i , D ( t ) n i , D ( t ) p i , R ( t ) n i , R ( t ) p i , H ( t ) n i , H ( t ) .
According to Yip et al. [12], for group i, death and recovery are regarded as two competing risks. To consider the real-time fatality rate on day t, the real-time fatality rate is defined as
p i ( t ) = p i , D ( t ) p i , D ( t ) + p i , R ( t ) ,
and the kernel function of the probability density function of Equation (1) is
H ( p ) = p i , D ( t ) n i , D ( t ) p i , R ( t ) n i , R ( t ) p i , H ( t ) n i , H ( t ) ,
therefore the logarithmic likelihood function of Equation (1) is
l ( p ) = log H ( p ) = n i , D ( t ) log p i , D ( t ) + n i , R ( t ) log p i , R ( t ) + n i , H ( t ) log p i , H ( t ) .
For p i , H ( t ) = 1 p i , D ( t ) p i , R ( t ) , set x 1 = p i , D ( t ) , x 2 = p i , H ( t ) , and p i , H ( t ) to derivation of x 1 and x 2 , respectively. We have p i , H ( t ) x 1 = p i , H ( t ) x 2 = 1 , that is,
p i , H ( t ) p i , D ( t ) = p i , H ( t ) p i , R ( t ) = 1 ,
so
log p i , H ( t ) p i , D ( t ) = log p i , H ( t ) p i , R ( t ) = 1 p i , H ( t ) .
Removing the redundancy term, we can regard l ( p ) as a function of p i , D ( t ) + p i , R ( t ) , with derivations of p i , D ( t ) and p i , R ( t ) respectively on l ( p ) , and obtain the following likelihood equation:
l ( p ) p i , D ( t ) = n i , D ( t ) p i , D ( t ) n i , H ( t ) p i , H ( t ) ,
l ( p ) p i , R ( t ) = n i , R ( t ) p i , R ( t ) n i , H ( t ) p i , H ( t ) .
Let Equations (5) and (6) be equal to 0; their maximum likelihood estimation solution satisfies p ^ i , D ( t ) = p ^ i , H ( t ) n i , H ( t ) n i , D ( t ) and p ^ i , R ( t ) = p ^ i , H ( t ) n i , H ( t ) n i , R ( t ) , respectively. Since p ^ i , D ( t ) + p ^ i , R ( t ) + p ^ i , H ( t ) = 1 , the maximum likelihood estimation value of p i , D ( t ) and p i , R ( t ) is
p ^ i , D ( t ) = n i , D ( t ) H i ( t 1 ) , p ^ i , R ( t ) = n i , R ( t ) H i ( t 1 ) .
Therefore, the maximum likelihood estimator of p i ( t ) is defined by
p ^ i ( t ) = p ^ i , D ( t ) p ^ i , D ( t ) + p ^ i , R ( t ) = n i , D ( t ) n i , D ( t ) + n i , R ( t ) .
With the passing of time, the severity of the epidemic will change, and the above-mentioned real-time fatality rate can be used to compare the changes in the severity of the epidemic among different groups. Therefore, the following hypothesis
H 0 : p 1 ( t ) = p 2 ( t ) = = p k ( t ) , H 1 : p i ( t ) , 1 i k , not all equal
is proposed by considering death and recovery as two competing risks. It is easy to see that the above hypothesis is equal to
H 0 : p 1 , D ( t ) p 1 , R ( t ) = p 2 , D ( t ) p 2 , R ( t ) = = p k , D ( t ) p k , R ( t ) , H 1 : p i , D ( t ) p i , R ( t ) , 1 i k , not all equal ,
when the null hypothesis holds true, and we expect that the two competing risks of death and recovery are similar during the observation period. Therefore, we propose the following expression of the sum of squares for factor ( S A ) and the sum of squares for error ( S E ); that is,
S A = i = 1 k m i j = 1 t n i , D ( j ) n i , R ( j ) m i i = 1 k j = 1 t n i , D ( j ) n i , R ( j ) k m i 2 , f A = k 1 ; S E = i = 1 k j = 1 t n i , D ( j ) n i , R ( j ) j = 1 t n i , D ( j ) n i , R ( j ) m i 2 , f E = i = 1 k m i k .
The following test statistic is obtained:
F = S A / f A S E / f E = MSA MSE .

2.2. The Asymptotic Distribution of the Test Statistic

Assume that P = ( p 1 , p 2 , , p n ) T is a column vector of an n-dimensional normal random variable. Let A be an n × n real symmetric matrix; then P T A P is called the quadratic form of A. Denote the rank of A by r. According to the knowledge of linear algebra, it can be seen that the rank of the quadratic P T A P is r, for short rank ( A ) = rank ( P T A P ) = r .
Suppose that the n observation values p i ( j ) ( i = 1 , , k ; j = 1 , , m i ) in one-way analysis of variance are n independent random variables and that p i ( j ) N ( μ i , σ 2 ) . Set J n is the n-order square matrix with all elements 1, I n is the n-order unit matrix, and C = d i a g ( C 1 , C 2 , , C k ) is a block diagonal matrix, where C i = J m i / m i ; that is:
I n = 1 0 0 0 1 0 0 0 1 n × n , J n = 1 1 1 1 1 1 1 1 1 n × n
and
C = J m 1 / m 1 0 0 0 J m i / m i 0 0 0 J m k / m k n × n .
Based on the above assumptions, the following three basic quadratic forms are obtained through calculation:
P T I n P = i = 1 k j = 1 m i p i ( j ) 2 ; P T J n P = T 2 , where T = i = 1 k j = 1 m i p i ( j ) ; P T C P = i = 1 k T i 2 / m i , where T i = j = 1 m i p i ( j ) .
Through the quadratic form in (10), Mao et al. [15] proved that S A , S E , and the sum of squares for total ( S T ) are quadratic forms of normal random vectors, whose ranks are equal to their respective degrees of freedom; therefore, we have
S A = P T A 1 P , A 1 = C J n / n , rank ( A 1 ) = k 1 ; S E = P T A 2 P , A 2 = I n C , rank ( A 2 ) = n k ; S T = S A + S E = P T A P , A = I n J n / n , rank ( A ) = n 1 .
Under the null hypothesis, we also have
S A σ 2 = P T A 1 P σ 2 χ 2 ( k 1 ) , S E σ 2 = P T A 2 P σ 2 χ 2 ( n k ) .
S A and S E are independent, so
F = : S A / ( k 1 ) S E / ( n k ) = MSA MSE F ( k 1 , n k ) .
Here, F is the probability distribution of the test statistic required for one-way analysis of variance. Then, for the given significance level α > 0 , it can be determined that the rejection region is
W = F > F 1 α ( k 1 , n k ) .
Since the data used in the analysis of variance need to satisfy the assumptions of normality, homogeneity of variance and independence are needed; however, since the data in this study do not satisfy these assumptions, we will use the permutation test for the variance analysis, which is introduced in detail in the next section.

3. Permutation Test

When the data do not satisfy the premise assumptions of normality, homogeneity of variance, or independence in the variance analysis, the variance analysis cannot be conducted [16]. In this case, statistical methods based on randomization and resampling can be used for testing. We use a widely applicable statistical method from the idea of randomization, which is called a permutation test. Xue et al. [17] pointed out that the permutation test has obvious advantages over traditional statistical methods because it does not need to make presuppositions about the distribution and is conducted solely based on the information contained in the observed samples.
The core problem to be solved by using the permutation test for variance analysis is how to estimate the probability distribution of the test statistic under the condition that the null hypothesis is established. The basic thought is that if the null hypothesis that states that there is no significant difference in the mean of the sample data from k groups is true, then the null hypothesis cannot be overturned by calculating the observation value (denoted as F 0 ) of its test statistics based on the observed k independent samples. At this time, if k samples are mixed and then randomly divided into k new samples, also known as permutation samples, the observed values of the test statistics are calculated again and tested, and the same inferred conclusion will be obtained.
Assuming that it is necessary to test whether there is any difference between the effect among k groups, then set a total of n = i = 1 k m i ( i = 1 , , k ) experimental units, and divide these experimental units into k parts. Each part contains m i units, and k different treatments are applied separately. The test results obtained from each treatment are recorded as Y i = ( y i 1 , , y i j , y i m i ) ( i = 1 , , k ; j = 1 , , m i ), and the effect value is U i = ( u i 1 , , u i j , u i m i ) ; then the assumption to be considered is
H 0 : U 1 = U 2 = = U k .
when assigning experimental units based on the principle of randomization. By arbitrarily lining up n experimental units into a row and applying k different treatments to m i units in turn, the data y i j are randomly extracted from the effect u i j . If the following test statistic is used
F = MSA MSE = i = 1 k m i ( y ¯ i y ¯ ) 2 / ( k 1 ) i = 1 k j = 1 m i ( y i j y ¯ i ) 2 ( i = 1 k m i k ) ,
then F is evaluated with the probability of 1 n ! every time; U i = ( u i 1 , , u i j , u i m i ) is arranged in a row and divided into k parts, where the value of each part is Y i ; and the value of F can be calculated according to (12). By repeating this process, n ! values of F can be obtained, denoted as F 1 , , F n i . Arrange all F 1 , , F n i into F ( 1 ) F ( 2 ) F ( n ! ) according to size, and select an appropriate F ( g ) ( g = 1 , , n ! ) according to significance level α .
When F 0 > F ( g ) , reject the null hypothesis, believing that there is a significant difference in the effect among k groups. The following is the rejection region of the permutation test:
W = F 0 > F 1 α .
Equation (13) uses the quantile of the permutation distribution as the critical value to determine the rejection region. It can be seen that both the permutation method and the parameter method need to calculate the test statistic. The difference is that the permutation method does not compare the test statistic with the theoretical distribution but rather compares the test statistic with the empirical distribution obtained by the permutation sample and judges whether there is enough evidence to reject the null hypothesis according to the extremeness of the statistical value.

4. Simulation

A large-scale simulation is carried out to evaluate the performance of the proposed approach, generating simulated data for daily confirmed cases, with deaths and recoveries unknown. This simulates the real-world epidemiological data in which only aggregated counts are reported during an epidemic outbreak. We assume that the observation period is 50 days ( T = 50 ), and it needs to take into account how real-time fatality rates changes over time in practice. Therefore, we assume that, during the 50-day observation period, when 0 < t t 0 , the daily number of confirmed cases follows the Poisson distribution of parameter λ 0 , and when t 0 < t 50 , the daily number of confirmed cases follows the Poisson distribution of parameter λ 1 . In addition, the simulation scenarios are based on the preassigned death probability and recovery probabilities on day t for group i. The daily number of deaths and recoveries are generated according to the multinomial setting in (1), and the real-time fatality rates are further calculated according to the generated simulation data. Finally, the permutation test is used for variance analysis to determine the differences in disease severity among k groups.
Under H 0 , the performance of the proposed test is evaluated by considering various scenarios in which the real-time fatality rates are equal among three groups (group A, group B, group C). From the simulation results in Table 1, we can see that the analysis of variance using the permutation test can sensitively capture the differences in real-time fatality rate among different groups, thus rejecting the null hypothesis of equal real-time fatality rate.
Figure 1 shows the multiple comparison tests of the first four scenarios at the confidence level α = 0.05 . The upper left corner shows that the confidence interval of the mean difference of each group in scenario 1 does not contain the 0 point. Therefore, we can judge that there is a significant difference in the real-time fatality rate among the three groups in scenario 1. The upper right corner shows the confidence interval of the mean difference of each group in scenario 2, and only the confidence interval of the mean difference between group A and group B contains the 0 point, which means that there is no significant difference in the real-time fatality rate between group A and group B. The lower left corner and the lower right corner respectively show the confidence interval of the mean difference of each group in scenarios 3 and 4. In these two figures, the confidence interval of mean difference of group B and group C contains the 0 point; that is, there is no significant difference in the real-time fatality rate between group B and group C under scenarios 3 and 4.

5. Empirical Analysis

COVID-19 is the most widespread global pandemic to hit humanity in the past century. It is a serious crisis for the whole world, posing a major threat to human life and health [18,19]. The first case of COVID-19 was found in Wuhan, Hubei Province, China, in December 2019. At present, China’s National Health Commission has renamed it as the Novel Coronavirus Infection. Kraemer et al. [20] found that, despite restrictions on mobility imposed in Wuhan on 23 January 2020, it still could not stop the rapid spread of COVID-19 to other cities. According to the statistics of the National Health Commission [21], as of 24:00 on 29 February 2020, a total of 79,824 confirmed cases had been reported in mainland China, including 2870 deaths; 66,907 confirmed cases and 2761 deaths had been reported in Hubei Province; and 49,122 confirmed cases and 2195 deaths had been reported in Wuhan. At first glance, the mortality in mainland China is about 3.60 % , the mortality in Hubei Province is about 4.13 % , and the mortality in Wuhan is about 4.47 % . However, it is difficult to estimate the mortality of an epidemic, especially for the outbreak of COVID-19. Medical staff all over China have to make every effort to deal with this extremely infectious and terrible disease. Next, the proposed method is applied to analyze the differences in disease severity among different regions over time.

6. Data Description

Considering that the intervention measures implemented during the epidemic and availability of medical resources may have a potential impact on the fatality rate in the initial stage of COVID-19 in China, we have divided mainland China into three different clusters, namely mainland China excluding Hubei Province, Hubei Province excluding Wuhan, and Wuhan, and we use the real-time fatality rate to analyze the differences in disease severity among different clusters. The daily number of confirmed cases, daily deaths, and daily recoveries for each cluster from the official website of the National Health Commission (http://www.nhc.gov.cn, accessed on 11 June 2020) were extracted and summarized from 31 January 2020 to 30 March 2020.
Figure 2 shows the real-time fatality rate estimator of the three independent clusters throughout the observation period. We can see that, in the initial stage of the COVID-19 epidemic, there are obvious differences in disease severity among different regions in mainland China. With the passage of time, the difference in the severity of the disease gradually decreases. In this regard, we provide some explanations to describe the observed phenomena.
In the initial stage of the epidemic, most of the confirmed cases were mainly concentrated in Hubei Province. Given the large number of patients and the lack of clinical experience with the disease, the local medical systems were overwhelmed. In Wuhan in particular, many patients were not treated in time, causing the highest mortality there, followed by Hubei Province (excluding Wuhan). On the other hand, the implementation of strict lockdown measures in Wuhan curbed the spread of the disease to a certain extent, providing valuable preparation time for other regions to prevent the epidemic. In the face of the unknown, sudden, and fierce natural disasters of the epidemic, the Chinese government decisively launched a battle of epidemic prevention and control and mobilized all necessary resources throughout the country to support virus control in Hubei Province and Wuhan. Leishenshan Hospital and Huoshenshan Hospital, which were used to focus on the treatment of COVID-19 patients, had 1500 beds and 100 beds, respectively. These two hospitals took only about ten days to realize the whole process, from the completion of the design plan to delivery. Cities in Hubei Province received paring assistance from other provinces [9]. As of 14 February 2020, more than 25,000 medical professionals rushed to Wuhan. In addition, Wuhan successfully established 16 Fangcang shelter hospitals, which eased the great pressure on the medical system [22]. Meanwhile, more than 40 designated hospitals were set up in Wuhan, mainly to treat severe COVID-19 patients [23]. During the observation period, the real-time fatality rate in mainland China (excluding Hubei Province) was basically stable at a low level. With the continuous alleviation of the pressure of medical resources in Wuhan and Hubei Province, their real-time fatality rate generally showed a continuous downward trend. Finally, by late February 2020, the real-time fatality rates in the three regions were no longer significantly different, and the level remained relatively low in March 2020.

Empirical Analysis Result

In this section, the proposed permutation test is applied to the analysis of variance to test the degree of difference in real-time fatality rates over time among different clusters. The unit is a day, which means there are T = 60 time intervals, and the 0th day ( T = 0 ) is 31 January 2020. A three-sample test is conducted on H 0 :
H 0 : p China ( t ) = p Hubei ( t ) = p Wuhan ( t ) , 0 < t T
and we set the overall significance level α to 0.05.
Specifically, during the 60-day observation period, the test statistic F 0 of the original sample is first calculated. Then, we mix the three original samples, randomly assign the original samples into three groups to obtain the permutation samples, calculate the test statistic F i of the permutation samples, and repeat this n ! times. Finally, F i is arranged in order, making the distribution of F i the empirical distribution. The decision is made by comparing F 0 with the empirical distribution. According to Table 2, there are generally significant differences in the real-time fatality rates of the three regions during the 60-day observation period. The multiple comparison test results in Figure 3 show that the confidence interval of the mean difference in real-time fatality rate between two or two of the three regions does not contain the 0 point, which shows significant difference in disease severity among the three regions during the whole observation period. According to Table 3, we can see that the real-time fatality rate in Hubei is 0.0532 higher than in China on average, while the real-time fatality rate in Wuhan is 0.1010 higher than in China on average, and 0.0478 higher than in Hubei.
Further, in order to measure the effectiveness of the intervention measures implemented by the Chinese government and the impact of improved medical resources on disease severity, we conduct a temporal examination of the real-time fatality rates in Hubei and Wuhan before and after assistance. First, the proposed method is performed for the period from 31 January 2020 to 26 February 2020, when T = 27 . According to the results in Figure 4, before the aid to Hubei and Wuhan and at the initial stage of the aid, the confidence interval of the mean difference of the real-time death rate between the three regions does not contain the 0 point, suggesting that there are still significant differences in the real-time fatality rate between the three regions.
In the late assistance period, the same inspection procedure is carried out from 27 February 2020 to 30 March 2020, when T = 33 . As shown in Figure 5, during this period, the confidence interval of the mean difference in the real-time fatality rate in the three regions contains the 0 point, suggesting that there is no significant difference in the real-time fatality rate in the three regions. It can be considered that the accessibility of medical resources in different regions is roughly the same, and the intervention policies implemented by the Chinese government have effectively alleviated the disease severity. The result shown in Figure 5 further confirms similar real-time fatality rates in the three regions from late February 2020 to late March 2020.

7. Conclusions

In this paper, a non-parametric test procedure is proposed. The non-parametric test method generally does not involve population parameters, and its assumptions are much less than those of a parametric test. Therefore, nonparametric tests are used to compare differences in real-time fatality rates among independent groups during epidemic outbreaks. Since implementing effective interventions can inhibit the spread of the disease to some extent, thereby reducing the disease severity and saving more lives, our approach can validate the effectiveness of interventions. The asymptotic distribution of the test statistic under the null hypothesis allows analysis of variance to be performed using the permutation test, and the null hypothesis can be rejected when there are significant differences in disease severity among groups. At the same time, our simulations show that the proposed test can accurately identify inter-group differences in disease severity for various scenarios. The test procedure proposed in this paper is applied to the data of the initial stage of the COVID-19 epidemic in mainland China to test the difference in the severity of the disease among the three independent clusters. The results show that there is a significant difference in the real-time fatality rate of the three clusters during the whole observation period. Through the discussion of time segments, we found that with the implementation of effective intervention measures and the improvement in medical resources, there is no significant difference in the real-time fatality rate of the three clusters in the later stage of observation. We can see that the severity of COVID-19 in mainland China may be related to the implementation of interventions and the availability of medical resources, which reflects the important role of effective interventions and medical resources in reducing the real-time fatality rate.
Our method can be applied to the case of k samples. In essence, this allows researchers to study more clinical questions in concrete real-life situations, such as investigating the extent of differences in the real-time fatality rate between multiple sex groups or age groups, which may help study other factors that influence the real-time fatality rate of diseases. It is important to note that different treatment regimens are used clinically for different groups, and investigating the extent to which mortality varies between treatment groups can help healthcare providers determine the most effective treatment regimens. The method proposed in this paper can be one of the important methods to evaluate the efficacy of different treatment regimens. The test procedure is based on the information contained in the observed sample and is not limited by the population distribution. Surveillance data from authorities during outbreaks is always incomplete, with some data such as gender and age being difficult to obtain, especially in areas with low awareness of protection and inadequate healthcare systems. Despite the post-pandemic period, detailed data on the COVID-19 are still difficult to obtain, and for most regions, only daily confirmed cases, deaths, and recoveries are recorded and aggregated. It is most important for the health department to use the simplest data structures to gain a deeper understanding of the disease so that rapid interventions can be taken to curb the spread of the disease early and reduce mortality.

Author Contributions

Methodology, M.M.; Formal analysis, J.F.; Data curation, C.Z. All authors have read and agreed to the published version of the manuscript.

Funding

Project supported by the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan), under Grants G1323523061.

Institutional Review Board Statement

The study was approved by the Institutional Research Ethics Committee of China University of Geosciences, Wuhan.

Data Availability Statement

The datasets on the daily confirmed cases, deaths and recoveries for each cluster from the official website of the National Health Commission (http://www.nhc.gov.cn, accessed on 11 October 2020).

Acknowledgments

The authors are grateful to three anonymous referees for their valuable comments, which have improved the presentation of this work.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Park, K. Park’s Textbook of Preventive and Social Medicine INDIA; Banarsidas Bhanot Press: Jabalpur, India, 2007. [Google Scholar]
  2. Madhav, N.; Oppenheim, B.; Gallivan, M.; Mulembakani, P.; Rubin, E.; Wolfe, N. Pandemics: Risks, Impacts, and Mitigation; Department of Global Health, University of Washington: Seattle, WA, USA, 2018. [Google Scholar]
  3. World Health Organisation. Coronavirus Disease Weekly Epidemiological Update and Weekly Operational Update. 2023. Available online: https://www.who.int/ (accessed on 1 January 2023).
  4. World Health Organisation. Estimating Mortality from COVID-19. Available online: https://www.who.int/publiccations/i/item/WHO-2019-nCoV-Sci-Brief-Mortality-2020.1 (accessed on 23 October 2022).
  5. Yip, P.S.; Lau, E.H.; Lam, K.F.; Huggins, R.M. A Chain Multinomial Model for Estimating the Real-Time Fatality Rate of a Disease, with an Application to Severe Acute Respiratory Syndrome. Am. J. Epidemiol. 2005, 161, 700–706. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  6. Last, J.M. A Dictionary of Epidemiology, 3rd ed.; Oxford University Press: New York, NY, USA, 1995. [Google Scholar]
  7. Lam, K.F.; Deshpande, J.V.; Lau, E.H.Y.; Naik-Nimbalkar, U.V.; Yip, P.S.F.; Xu, Y. A test for constant fatality rate of an emerging epidemic: With applications to severe acute respiratory syndrome in Hong Kong and Beijing. Biometrics 2008, 64, 869–876. [Google Scholar] [CrossRef] [PubMed]
  8. Ambreen, K.; Faisal, M.; Ayesha, A. Investigating the roles of meteorological factors in COVID-19 tranmission in Northern Italy. Environ. Sci. Pollut. Res. 2021, 28, 48459–48470. [Google Scholar]
  9. Chen, T.; Wang, Y.; Hua, L. Pairing assistance the effective way to solve the breakdown of health services system caused by COVID-19 pandemic. Int. J. Equity Health 2020, 19, 68. [Google Scholar] [CrossRef] [PubMed]
  10. Reich, N.G.; Lessler, J.; Cummings, D.A. Estimating absolute and relative case fatality ratios from infectious disease surveillance data. Biometrics 2012, 68, 598–606. [Google Scholar] [CrossRef] [PubMed]
  11. Chen, Z.; Akazawa, K.; Nakamura, T. Estimating the case fatality rate using a constant cure-death hazard ratio. Lifetime Data Anal. 2009, 15, 316–329. [Google Scholar] [CrossRef] [PubMed]
  12. Yip, P.S.; Lam, K.F.; Lau, E.H.; Chau, P.H.; Tsang, K.W.; Chao, A. A comparison study of real-time fatality rates: Severe acute respiratory syndrome in Hong Kong, Singapore, Taiwan, Toronto and Beijing, China. J. R. Stat. Soc. 2005, 168, 233–243. [Google Scholar] [CrossRef] [PubMed]
  13. Yuanke, Q.; Chun, Y.L.; Lam, K.F. A sequential test to compare the real-time fatality rates of a disease among multiple groups with an application to COVID-19 data. Stat. Methods Med. Res. 2022, 31, 348–360. [Google Scholar]
  14. Yang, X.P.; Yang, L.P. Summary analysis of case reports of pulmonary fibrosis caused by COVID-19. Chin. J. Hosp. Pharm. 2023, 43, 663–668. [Google Scholar]
  15. Mao, S.S.; Zhou, J.X.; Chen, Y. Experimental Design, 3rd ed.; China Statistics Press: Beijing, China, 2020. [Google Scholar]
  16. Hapfelmeier, A.; Hornung, R.; Haller, B. Efficient permutation testing of variable importance measures by the example of random forests. Comput. Stat. Data Anal. 2023, 181, 107689. [Google Scholar] [CrossRef]
  17. Xue, W. Statistical Analysis and Data Mining Based on R; Renmin University of China Press: Beijing, China, 2014. [Google Scholar]
  18. Jiang, F.; Zhao, Z.; Shao, X. Time series of COVID-19 infection curve: A change-point perspective. J. Econom. 2023, 232, 1–17. [Google Scholar] [CrossRef] [PubMed]
  19. Ellyssa, M.; Brereton, L. Dialysis patient experiences during the COVID-19 pandemic: A survey study. Kidney Med. 2023, 5, 100673. [Google Scholar]
  20. Kraemer, M.U.; Yang, C.H.; Gutierrez, B.; Wu, C.H.; Klein, B.; Pigott, D.M. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 2020, 368, 493–497. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  21. National Health Commission of the People’s Republic of China. Report of Epidemic Situation. 2023. Available online: http://www.nhc.gov.cn (accessed on 29 February 2020).
  22. Wang, X.; Fang, J.; Zhu, Y.; Chen, L.; Ding, F.; Zhou, R.; Ge, L.; Wang, F.; Chen, Q.; Zhang, Y.; et al. Clinical characteristics of non-critically ill patients with novel coronavirus infection (COVID-19) in Fangcang Hospital. Clin. Microbiol. Infect. 2020, 26, 1063–1068. [Google Scholar] [CrossRef] [PubMed]
  23. Li, Q.; Wang, L.; Wang, B.; Lu, H. The COVID-19-designated hospitals in China: Preparing for public health emergencies. Emerg. Microbes Infect. 2021, 10, 998–1001. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Multiple comparison tests in the first four simulated scenarios.
Figure 1. Multiple comparison tests in the first four simulated scenarios.
Entropy 25 01028 g001
Figure 2. Changes in real-time fatality rate in different clusters during the observation period.
Figure 2. Changes in real-time fatality rate in different clusters during the observation period.
Entropy 25 01028 g002
Figure 3. Confidence intervals for mean differences in real-time fatality rate across the observation period.
Figure 3. Confidence intervals for mean differences in real-time fatality rate across the observation period.
Entropy 25 01028 g003
Figure 4. Confidence intervals for mean differences in real-time fatality rate among three regions before and at the beginning of the assistance.
Figure 4. Confidence intervals for mean differences in real-time fatality rate among three regions before and at the beginning of the assistance.
Entropy 25 01028 g004
Figure 5. Confidence intervals for mean differences in real-time fatality rate among three regions in the post-assistance period.
Figure 5. Confidence intervals for mean differences in real-time fatality rate among three regions in the post-assistance period.
Entropy 25 01028 g005
Table 1. Simulation results.
Table 1. Simulation results.
Scenario p 1 , D ( t ) p 1 , R ( t ) p 2 , D ( t ) p 2 , R ( t ) p 3 , D ( t ) p 3 , R ( t ) p-ValueMultiple Comparison
10.010.90.060.90.10.89≪0.00
20.010.90.020.890.10.8≪0.00C-A,C-B
30.010.740.050.890.030.75≪0.00B-A,C-A
40.010.740.040.80.020.75≪0.00B-A,C-A
50.020.800.020.80.030.8 0.0316 C-B
60.020.800.020.850.030.85 0.1312
70.020.800.040.80.020.9≪0.00B-A,C-B
80.020.600.060.740.020.82≪0.00B-A,C-B
90.050.70.020.630.02 0.6 ≪0.00B-A,C-B
100.050.650.050.670.040.7 0.672
110.050.60.10.740.04 0.7 ≪0.00B-A,C-B
120.050.820.030.740.250.74≪0.00
130.140.780.070.740.11≪0.00B-A,C-B
140.140.750.200.600.27 0.6 ≪0.00
150.14 0.7 0.17 0.75 0.220.74≪0.00C-A,C-B
160.14 0.7 0.08 0.75 0.270.61≪0.00
: there is a difference between any two in the groups ; −: there is no difference among all groups.
Table 2. Permutation test variance analysis.
Table 2. Permutation test variance analysis.
SourceDFSQMSQFp-Value
Region20.306380.1531913.25173<0.00000
Residuals1772.046640.01156
Total1792.35302
DF = degrees of freedom; SQ = sum of squares; MSQ = sum of mean squares; F = the test statistic.
Table 3. Multiple comparison test.
Table 3. Multiple comparison test.
RegionDiffLwrUprp-Value
Hubei–China0.05320.00680.09970.0200
Hubei–China0.10100.05460.14740.0000
Wuhan–Hubei0.04780.00130.09420.0421
Diff = mean difference; Lwr = lower confidence limit; Upr = upper confidence limit.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhou, C.; Fang, J.; Mao, M. A New Analysis of Real-Time Fatality Rate in the Initial Stage of COVID-19. Entropy 2023, 25, 1028. https://doi.org/10.3390/e25071028

AMA Style

Zhou C, Fang J, Mao M. A New Analysis of Real-Time Fatality Rate in the Initial Stage of COVID-19. Entropy. 2023; 25(7):1028. https://doi.org/10.3390/e25071028

Chicago/Turabian Style

Zhou, Chuanbo, Jiaohong Fang, and Mingzhi Mao. 2023. "A New Analysis of Real-Time Fatality Rate in the Initial Stage of COVID-19" Entropy 25, no. 7: 1028. https://doi.org/10.3390/e25071028

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop