e-ISSN: 2582-5208 International Research Journal of Modernization in Engineering Technology and Science Volume:02/Issue:11/November -2020
Impact Factor- 5.354
www.irjmets.com
PREDICTIONS OF COVID-19 DEATHS IN INDIA, ITALY AND USA POST LOCKDOWN USING LINEAR REGRESSION ANALYSIS
Sahana K S*1, Stavelin Abhinandithe K2, Dr. Madhu B3, Preethi R Bhat4, Dr. Balasubramanian S5
*1Research Scholar, Division of Medical Statistics, JSS AHER.
2Assistant Professor, Division of Medical Statistics, JSS AHER.
3Associate Professor, Dept. of Community Medicine, JSS AHER.
4M.Sc in Medical Statistics, Division of Medical Statistics, JSS AHER.
5Dean & Director Research, JSS AHER.
ABSTRACT
The outbreak of COVID-19 (coronavirus disease), caused by a newly discovered coronavirus, has severely disrupted human society. This study aimed to identify the trend in deaths expected from COVID-19 by the end of May in India, USA and Italy. Method: Case registries were used to obtain worldwide and Indian coronavirus statistics. Simple linear regression analysis, correlation analysis and residual analysis were used. Analysis was carried out using the Data Analysis ToolPak of Excel 2013. Result: Linear regression analysis predicted mortality counts of 2,594, 28,538 and 63,607 in India, Italy and USA respectively, at a 95% confidence level, and the residual plot indicated that the prediction model was appropriate.
Keywords: India, USA, Italy, Coronavirus, mortality counts, regression, correlation.
I. INTRODUCTION
The epidemic of COVID-19 (coronavirus disease), caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), has wrought destruction on human society. Since its appearance in the Chinese city of Wuhan (Hubei Province), there has been a continuous stream of new cases and deaths [9]. The novel mutations of the virus and the unknown factors associated with them make it all the more frightening [2]. The local outbreak of COVID-19 began among people exposed to the seafood market before the end of December 2019. As the epidemic spread, the virus reached communities through the early-infected people, establishing community transmission. Interpersonal and cluster transmission occurred in multiple communities and families in Wuhan. The epidemic expanded rapidly and spread from Hubei Province to other parts of China owing to the high mobility of people during the Chinese Lunar New Year, while the number of COVID-19 cases in other countries gradually increased. The plan of action was to halt its advance through community quarantine and a scientific protection strategy, and to produce new rapid testing equipment and medicines [1,3,7]. Coronaviruses are RNA viruses in the family Coronaviridae, order Nidovirales [10]. Coronaviruses are categorized into three classes based on the virus's various structural proteins (spike, envelope and nucleocapsid) [8]. The SARS coronavirus lies within category 2. Currently there are more than 26 million patients affected by COVID-19, with more than 870 thousand deaths worldwide [4]. In India, as of 11 May 2020, there were approximately 59,000 confirmed cases with 1,981 deaths [6]. A number of nations, including India, entered lockdown to keep the deadly disease from spreading.
With new rapid diagnostic tools becoming available and trials of potentially helpful medicines under way, we need a better understanding of the course of the disease and of what it might bring in the coming months. Drawing on SARS-CoV-2 data from credible sources, we collated the available data on total infections, total fatalities and recoveries from around the world and then applied statistical analysis to anticipate what can be expected in India, USA and Italy in the coming months.
Infectious disease transmission is a complicated diffusion process occurring within a population. Models of this process can be built to analyze and study the transmission of infectious diseases theoretically, so that the future development trend of an infectious disease can be predicted accurately.
II. MATERIALS AND METHODS
Publicly available data were collected from the WHO COVID-19 situation reports, and the Indian data were updated from the covid19india.org website. The data were compiled in an Excel file, and correlation and regression analyses were applied to total cases and total death counts using the Data Analysis ToolPak of Excel 2013. Simple linear regression is the most widely used method for analyzing the relationship between a quantitative outcome and a single quantitative explanatory variable. In linear regression we typically have many different values of the explanatory variable, and we generally assume that values between the observed values of the explanatory variable are also possible values. We postulate a linear relationship between the population mean of the outcome and the value of the explanatory variable. If we let Y be some outcome and x be some explanatory variable, then we can express the structural model using the equation
E(Y|x) = β0 + β1x
in which E(), read as "expected value of," indicates a population mean; Y|x, read as "Y given x," indicates that we are looking at the possible values of Y when x is restricted to some single value; β0, read "beta zero," is the intercept parameter; and β1, read "beta one," is the slope parameter. A common term for any parameter or parameter estimate used in an equation for predicting Y from x is coefficient. The subscript "1" in β1 is often replaced by the name of the explanatory variable or some abbreviation of it. The structural model therefore says that for each value of x the population mean of Y (over all subjects that have that particular value "x" for their explanatory variable) can be calculated using the simple linear expression β0 + β1x.
Of course, we cannot make this calculation exactly, because the two parameters are unknown "secrets of nature." In practice, we make estimates of the parameters and substitute the estimates into the equation. [5] Inputs: total number of infected cases, active cases, recovery statistics. Outputs: total death counts, CFR (case fatality rate).
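The estimation step described above can be illustrated with a minimal sketch of ordinary least squares for a single explanatory variable. The study itself used the Excel Data Analysis ToolPak; the function name and the demonstration data below are hypothetical and are not taken from the study.

```python
from statistics import mean

def fit_simple_linear_regression(x, y):
    """Return (b0, b1), the least-squares estimates of intercept and slope."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and sum of cross-products of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    return b0, b1

# Hypothetical demonstration data lying exactly on y = 2 + 3x (not study data).
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [5.0, 8.0, 11.0, 14.0]
b0_demo, b1_demo = fit_simple_linear_regression(x_demo, y_demo)
print(b0_demo, b1_demo)
```

Substituting the estimates b0 and b1 into β0 + β1x then gives the predicted mean outcome for any value of x.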
III. PRE ANALYSIS PHASE
The data source included one missing value (NA), namely the US recovery figure. In light of the heterogeneity of the data and the presence of outliers, imputation with the mean was not used. Instead, a correlation analysis excluding the US record was performed in Excel as the basis for imputation; a positive correlation of r = 0.6860 (P<0.001) was identified between the total number of cases and recoveries. Using this relationship and the equation provided by linear regression (Y, i.e., recovery cases of US = b0 + b1 * [Total cases of USA], with b0 = 8508.80 and b1 = 0.3454), the missing value (75936) was imputed. Afterwards, the study was carried out. (Table 1)

Table-1: Actual data containing all parameters relevant to coronavirus for March and April and the total mortality outputs for March and April, including the imputed value.

| Country | Total cases (March 31) | Active cases (March 31st) | Active cases (April 30th) | Recovery cases (March 31) | March deaths | CFR | April deaths |
|---|---|---|---|---|---|---|---|
| India | 1397 | 1239 | 24,641 | 123 | 35 | 2.505 | 1154 |
| China | 81554 | 2004 | 619 | 74881 | 3305 | 4.052 | 4633 |
| Italy | 105792 | 77635 | 101551 | 15729 | 12428 | 11.747 | 27967 |
| USA | 193353 | 180900 | 878843 | 75936 | 5151 | 2.664 | 60856 |
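As a hedged illustration of the two calculations behind Table 1, the sketch below applies the reported imputation equation and computes the CFR. It assumes that CFR is defined as March deaths divided by total cases on March 31, expressed as a percentage, which agrees with the tabulated values to within rounding. With the rounded coefficients quoted in the text, the imputed US recovery figure comes to about 75,293, close to but not identical to the 75,936 in Table 1; the gap may reflect rounding of the published coefficients. All variable names are illustrative.

```python
# Reported imputation coefficients and the US total cases from Table 1.
b0_impute, b1_impute = 8508.80, 0.3454
us_total_cases = 193353

# Regression imputation: recoveries = b0 + b1 * total cases.
imputed_us_recoveries = b0_impute + b1_impute * us_total_cases
print(round(imputed_us_recoveries))

def case_fatality_rate(deaths, total_cases):
    """CFR as a percentage of total cases (assumed definition)."""
    return 100.0 * deaths / total_cases

# country: (total cases on March 31, March deaths), from Table 1.
table1 = {
    "India": (1397, 35),
    "China": (81554, 3305),
    "Italy": (105792, 12428),
    "USA": (193353, 5151),
}
cfr = {country: case_fatality_rate(deaths, cases)
       for country, (cases, deaths) in table1.items()}
for country, value in cfr.items():
    print(country, f"{value:.3f}")
```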
IV. RESULTS
Analysis for May death number forecast:
Step 1: A correlation analysis was conducted to determine the existence and degree of association between the output (April death count) and the March and April inputs. A strong correlation existed between April deaths and all the input variables except recovery cases (r = -0.2034). (Table 2)

Table-2: Correlation analysis of the association between April deaths and each of the input factors.

| | Total Cases (March 31) | Active Cases (March 31st) | Active Cases (April 30th) | Recovery Cases (March 31) | March Deaths | CFR | April Deaths |
|---|---|---|---|---|---|---|---|
| Total Cases (March 31) | 1 | | | | | | |
| Active Cases (March 31st) | 0.686 | 1 | | | | | |
| Active Cases (April 30th) | 0.494 | 0.9716 | 1 | | | | |
| Recovery Cases (March 31) | 0.4786 | -0.3104 | -0.5263 | 1 | | | |
| March Deaths | 0.8439 | 0.9692 | 0.8836 | -0.0669 | 1 | | |
| CFR | 0.7859 | 0.989 | 0.9261 | -0.1667 | 0.9949 | 1 | |
| April Deaths | 0.7623 | 0.9938 | 0.9395 | -0.2034 | 0.9905 | 0.9993 | 1 |
Step 2: Simple linear regression analyses were conducted between the April death count and each of the other input parameters to identify the most significant input parameter, which could then be used to construct a model for the May fatality forecast in India, Italy and USA. The P-value was, however, significant only for the active cases (March 31st) input parameter. (Table 3)

Table 3: Outcomes of a linear regression analysis with April mortality counts as output and active cases of March as input.

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.998874199 |
| R Square | 0.997749666 |
| Adjusted R Square | 0.996624499 |
| Standard Error | 1677.380144 |
| Observations | 4 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 2494982997 | 2494982997 | 886.75694 | 0.001125801 |
| Residual | 2 | 5627208.298 | 2813604.15 | | |
| Total | 3 | 2500610205 | | | |

Coefficients

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 2173.361852 | 1122.78179 | 1.93569389 | 0.1925429 | -2657.57828 | 7004.301986 |
| Active cases (March 31st) | 0.339663962 | 0.011406363 | 29.7784643 | 0.0011258 | 0.290586345 | 0.388741579 |
The goodness of fit (Adjusted R Square) of the above simple linear regression demonstrates the strong predictive power of the model. Nonetheless, all of the factors except March's active cases failed to show a significant contribution to the model.
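The summary statistics in Table 3 are linked by standard identities, and the short sketch below recomputes them from the ANOVA sums of squares as a consistency check. The inputs are copied from Table 3; the variable names are illustrative.

```python
import math

n_obs, n_predictors = 4, 1
ss_regression = 2494982997.0
ss_residual = 5627208.298
ss_total = ss_regression + ss_residual

# Share of total variation explained by the regression.
r_squared = ss_regression / ss_total
# Adjustment for the number of predictors relative to the sample size.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
# Residual mean square, F statistic and standard error of the regression.
ms_residual = ss_residual / (n_obs - n_predictors - 1)
f_stat = (ss_regression / n_predictors) / ms_residual
standard_error = math.sqrt(ms_residual)
# t statistic of the slope: coefficient divided by its standard error.
t_slope = 0.339663962 / 0.011406363

print(r_squared, adj_r_squared, f_stat, standard_error, t_slope)
```

With a single predictor, the square of the slope's t statistic equals the F statistic, which is why both carry the same P-value of about 0.0011.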
Step 3: Consequently, a simple regression analysis was used to project the death tolls from the strongest input variable. The model was robust, with R = 0.99, R2 = 0.99 and adjusted R2 = 0.99, P ≈ 0.001, 95% CI. The lower, upper and mean estimates of May deaths were computed from the lower and upper bounds of the 95% confidence intervals of the coefficients and from their point estimates (Table 4). Thus the predicted May mortality counts in India, Italy and USA were 2594, 28538 and 63607 respectively, forecast from the data available across the affected countries. (Table 4)
The regression equation was used to find the predicted death count for the month of May using the strongest input variable, i.e., active cases of March. The equation is as follows:
Y = b0 + b1*(active cases)
Death in May = 2173.36 + 0.3396*(active cases of March)

Table 4: Prediction of the May death toll in India, Italy and USA, built on the linear regression analysis.

| Country | Estimate (95% confidence interval) | b0 | b1 | May's predicted deaths |
|---|---|---|---|---|
| India | Mean point of estimation | 2173.36 | 0.3396 | 2,594 |
| India | Lower point of estimation | -2657.5783 | 0.29058634 | -2,297 |
| India | Upper point of estimation | 7004.30199 | 0.38874158 | 7,485 |
| Italy | Mean point of estimation | 2173.36 | 0.3396 | 28,538 |
| Italy | Lower point of estimation | -2657.5783 | 0.29058634 | 19,856 |
| Italy | Upper point of estimation | 7004.30199 | 0.38874158 | 37,181 |
| USA | Mean point of estimation | 2173.36 | 0.3396 | 63,607 |
| USA | Lower point of estimation | -2657.5783 | 0.29058634 | 49,803 |
| USA | Upper point of estimation | 7004.30199 | 0.38874158 | 77,320 |
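The arithmetic behind Table 4 can be sketched as follows, using March's active cases from Table 1 and the coefficient pairs listed in Table 4. The mean estimates reproduce the tabulated values after rounding; the lower and upper estimates agree with the table to within about 0.3%, presumably because the published figures were computed with rounded coefficients. Note that substituting the confidence bounds of both coefficients at once is the procedure described in the text and is not the same as a conventional prediction interval. Names such as `predict_deaths` are illustrative.

```python
# Active cases on March 31st, from Table 1.
active_cases_march = {"India": 1239, "Italy": 77635, "USA": 180900}

# (b0, b1) pairs from Table 4: point estimates and 95% bounds of each coefficient.
coefficients = {
    "mean": (2173.36, 0.3396),
    "lower": (-2657.5783, 0.29058634),
    "upper": (7004.30199, 0.38874158),
}

def predict_deaths(active_cases, b0, b1):
    """Apply Y = b0 + b1 * (active cases of March)."""
    return b0 + b1 * active_cases

predictions = {
    country: {label: predict_deaths(cases, b0, b1)
              for label, (b0, b1) in coefficients.items()}
    for country, cases in active_cases_march.items()
}
for country, estimates in predictions.items():
    print(country, {label: round(value) for label, value in estimates.items()})
```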
Step 4: The regression model used to predict the May death count was then subjected to residual analysis to check whether it is an appropriate model for the given data.

Table 5: Residual output.

| Observation | Predicted April deaths | Residuals | Standard Residuals |
|---|---|---|---|
| 1 | 2872.55 | -1718.55 | -1.28 |
| 2 | 3120.14 | 1512.86 | 1.13 |
| 3 | 27597.92 | 369.08 | 0.27 |
| 4 | 61019.38 | -163.38 | -0.12 |
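The residual columns of Table 5 can be recomputed from the observed April deaths in Table 1 and the predicted values in Table 5. The sketch below assumes that the standard residual is the residual divided by the sample standard deviation of all residuals, an assumption that reproduces the tabulated values to two decimals. Variable names are illustrative.

```python
from statistics import stdev

# Observed April deaths (Table 1, in the order India, China, Italy, USA).
observed_april_deaths = [1154, 4633, 27967, 60856]
# Predicted April deaths for observations 1 to 4 (Table 5).
predicted_april_deaths = [2872.55, 3120.14, 27597.92, 61019.38]

# Residual = observed value minus fitted value.
residuals = [obs - pred
             for obs, pred in zip(observed_april_deaths, predicted_april_deaths)]

# Assumed scaling: divide by the sample standard deviation of the residuals.
residual_sd = stdev(residuals)
standard_residuals = [r / residual_sd for r in residuals]

for i, (r, z) in enumerate(zip(residuals, standard_residuals), start=1):
    print(i, round(r, 2), round(z, 2))
```

Standard residuals within roughly ±2, with no visible pattern against the explanatory variable, are the usual informal sign that a linear fit is adequate, although four observations give very limited evidence either way.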
Figure 1: Residual plot of the input and output variable of the regression model considered for prediction.
The above plot shows residuals on the vertical axis and the independent variable that was used for prediction (March's active cases) on the horizontal axis. The points in the residual plot are randomly dispersed around the horizontal axis, which indicates that the linear regression model was appropriate for the given data.
Step 5: Comparison of the predicted May death count with the actual mortality in the month of May.

Table 6: May death counts predicted from the linear regression model compared with the actual death counts of May.

| Country | Predicted May deaths: lower point of estimation | Predicted May deaths: mean estimation | Predicted May deaths: upper point of estimation | Actual May deaths |
|---|---|---|---|---|
| India | -2,297 | 2,594 | 7,485 | 5,408 |
| Italy | 19,856 | 28,538 | 37,181 | 33,508 |
| USA | 49,803 | 63,607 | 77,320 | 108,907 |
From Table 6 we can see that the actual May deaths lie between the lower and upper limits of the predicted death counts for India and Italy, but mortality in the USA rose drastically beyond the predicted range and is still increasing exponentially. COVID-19 has become one of the leading causes of death in the USA and in the other countries affected by it. Relaxed social distancing mandates may account for much of this drastic increase in the death count.
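The comparison in Table 6 amounts to a simple range check, sketched below with the tabulated values. The dictionary layout and variable names are illustrative.

```python
# country: (lower estimate, mean estimate, upper estimate, actual May deaths)
table6 = {
    "India": (-2297, 2594, 7485, 5408),
    "Italy": (19856, 28538, 37181, 33508),
    "USA": (49803, 63607, 77320, 108907),
}

# True where the actual count falls between the lower and upper estimates.
within_range = {
    country: lower <= actual <= upper
    for country, (lower, _mean, upper, actual) in table6.items()
}
# How far the actual count exceeds the upper estimate, where it does.
excess_over_upper = {
    country: actual - upper
    for country, (_lower, _mean, upper, actual) in table6.items()
    if actual > upper
}
print(within_range)
print(excess_over_upper)
```

For the USA the actual count exceeds the upper estimate by 31,587 deaths, while the actual counts for India and Italy both fall above the mean estimate but inside the range.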
Discussion: India and the rest of the world are in the 10th month of the COVID-19 epidemic. What awaits the planet is a critical stage in which successful prevention strategies can avert the kind of devastation faced by countries such as the US, Brazil and India, where both suffering and deaths have risen exponentially. In the countries listed above, the number of infected cases has risen exponentially from the 2nd month onward. In the later period, medical resources are becoming more abundant, the capabilities of medical personnel are growing, resource support across the country is improving, and the ability to refine management and treatment is strengthening. These factors are likely to rapidly reduce the mortality rate of COVID-19. They may also have some influence on the cumulative number of confirmed cases, but because that number is so large, the influence of these favorable human factors on it may be small. A further question of concern is: when will the COVID-19 epidemic end? Judging from the SARS situation in 2003, the date corresponding to the maximum number of cumulative diagnoses was essentially the date on which the epidemic ended.
Limits to the analysis: The key limitation of this study is that it takes the input data at face value without considering the logistical measures that were or were not undertaken during the period. Indeed, the end-of-month figures reflect both the natural course of the virus and the responses of local governments. Furthermore, restricting our study to only 3-4 affected nations may overstate the results.
Study strengths: Despite these drawbacks, the greatest strength of this analysis lies in the prediction model, which has a very high adjusted R2.
V. CONCLUSION
Accurate and timely infectious disease forecasts could inform public health responses to both seasonal epidemics and future pandemics by providing guidance on the utility, scale and timing of prevention and mitigation strategies. At the time of going to press, roughly 26 million cases of coronavirus disease had been reported globally, with 874,193 deaths [8]. At the end of August, India had 3,687,939 confirmed cases and about 65,435 deaths, and the United States had a total of 6,207,546 confirmed cases with 187,710 deaths [9]. Fortunately, China had reduced the number of people infected by the end of April. Lack of social distancing, unrestricted vehicle movement, the reopening of private offices after lockdown, changing weather patterns and community spread may account for this exponential growth in the death count.
VI. REFERENCES
[1] Armitage R, Nellums LB. COVID-19 and the consequences of isolating the elderly. The Lancet. 2020. [Online] Available at: https://www.thelancet.com/action/showPdf?pii=S2468-2667%2820%2930061-X
[2] Cascella M, Rajnik M, Cuomo A, Dulebohn SC, Napoli RD. Features, Evaluation and Treatment Coronavirus (COVID-19). 2020. [Online] Available at: https://www.ncbi.nlm.nih.gov/books/NBK554776/
[3] Coronavirus disease (COVID-19) technical guidance: Laboratory testing for 2019-nCoV in humans. World Health Organization. 2020. [Online] Available at: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/technical-guidance/laboratory-guidance
[4] Coronavirus disease (COVID-2019) situation reports. World Health Organization. [Online] Available at: https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports/
[5] http://www.google.com/url?sa=t&source=web&rct=j&url=http://www.stat.cmu.edu/~hseltman/309/Book/chapter9.pdf&ved=2ahUKEwikrOWWn6bpAhXlxTgGHZzWDkgQFjAbegQIARAB&usg=AOvVaw3av0SsziZpo%yWHmVSNT8g
[6] INDIA COVID-19 TRACKER. 2020. [Online] Available at: https://www.covid19india.org/
[7] Information for Clinicians on Therapeutic Options for COVID-19 Patients. Centers for Disease Control and Prevention. 2020. [Online] Available at: https://www.cdc.gov/coronavirus/2019-ncov/hcp/therapeutic-options.html
[8] Peiris JSM. Coronaviruses. 2012. Chapter 35. Medical Microbiology (Eighteenth Edition). © 2012 Elsevier Ltd.
[9] Remuzzi A, Remuzzi G. COVID-19 and Italy: what next? The Lancet 2020. [Online] Available at: https://www.thelancet.com/action/showPdf?pii=S0140-6736%2820%2930627-9
[10] The species severe acute respiratory syndrome related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2. Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. Nature Microbiology 2020; 5: 536-544.