PREDICTION ANALYSIS OF COVID-19 CASES IN INDIA USING PROPHET MODEL


ABSTRACT

The world is currently battling COVID-19, a disease first reported in the Chinese city of Wuhan. As the outbreak spread to large numbers of people outside China, the WHO declared it a pandemic because of the continuous rise in positive cases and deaths. Scientists are working relentlessly to develop a vaccine that would protect millions of individuals from this lethal virus, and countries are attempting to contain the infection in different ways. India imposed a strict nationwide lockdown from 25th March 2020 to contain the spread of the virus.

In this blog, coronavirus case data and Google mobility data for three Indian states (Kerala, Maharashtra & Tamil Nadu) are studied to understand the effect of the lockdown and of public movement during COVID-19, and how effective prevention strategies helped overturn the predicted results. The first half of the blog analyses the lockdown against infections in the three states by comparing Google Mobility data with the number of confirmed cases. The second half discusses a forecasting model built with Facebook’s Prophet algorithm to project the number of COVID-19 cases and deaths.


WHAT IS THE SITUATION OF COVID-19 IN INDIA?

As of May 20th, 2020, more than 1 lakh (100,000) positive COVID-19 cases had been reported in India since the first confirmed case was recorded in Kerala on 30th January 2020. It is evident from the graph that more than half of the total cases in India come from the top four states: Maharashtra, Gujarat, Tamil Nadu, and Delhi. Although Kerala was the first affected state in India, it had registered only around 560 cases by the third week of May, far fewer than the severely affected states.

The Indian government ordered a nationwide lockdown for 21 days when the number of confirmed cases reached approximately 500. During this phase, the country observed a gradual decrease in the growth rate of infections. But, as the graph shows, cases started to surge once people began stepping out of their homes without observing social distancing measures. During the second and third phases of lockdown, the country was classified into red, orange, and green zones depending on the spread of the virus. Once essential retail shops and private offices reopened and inter-state travel restrictions were lifted, cases increased rapidly even though the lockdown remained in effect.

LOCKDOWN VS CORONA IN TAMIL NADU


The first case in Tamil Nadu was reported on March 7th, 2020. Public movement spiked on the day before the lockdown and dipped drastically during the lockdown period. During the first lockdown, only medical shops and grocery markets were allowed to function. However, cases started to rise as people stepped out for essential items and travelled to their hometowns. An intensified lockdown was imposed on the top five red-zone districts (Chennai, Coimbatore, Tiruppur, Salem & Madurai) for four days, from 26th to 29th April 2020. Because this intensified lockdown was announced at short notice, public movement surged ahead of it, and cases kept increasing even though the lockdown was still in place.

LOCKDOWN VS CORONA IN MAHARASHTRA


Maharashtra registered its first confirmed case on 9th March 2020. The state consistently recorded the sharpest single-day increases in confirmed cases in the nation. As in Tamil Nadu, public movement spiked the day before the lockdown began and then fell drastically. Mumbai, Pune & Thane, which are known for their high population density, were the most affected areas in Maharashtra.

LOCKDOWN VS CORONA IN KERALA

Kerala declared a state calamity after registering three cases in January. No new cases were reported in the state for a month, and the emergency was withdrawn. Unfortunately, Kerala started reporting new positive cases from 9th March 2020. As in Tamil Nadu and Maharashtra, public movement jumped whenever lockdown dates were announced. Before the Indian government announced the 21-day lockdown, the state government had ordered a complete lockdown until the end of March. Hence, a spike in public movement was observed again on 23rd March 2020 in the Grocery and Pharmacy category. By the end of phase 1 of the lockdown, Kerala had recorded around 350 positive cases. The health ministry of Kerala devised strategies for effective contact tracing and isolated the identified individuals for 28 days instead of the 14 days recommended by the National health council. As a result, the state was able to decrease the growth rate of infection substantially by the end of the second lockdown.
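The state-level comparisons above pair a mobility series with a case series on matching dates. A minimal sketch of that alignment step is given here, assuming cumulative confirmed counts and a mobility "percent change from baseline" value are both keyed by date. The function names and all numbers are hypothetical and for illustration only; they are not taken from the datasets used in this blog.

```python
from datetime import date


def daily_new_cases(cumulative):
    """Convert cumulative confirmed counts into per-day new cases.

    The first observed day is reported as its full cumulative count,
    because no earlier total is available to subtract.
    """
    new_cases = {}
    previous = 0
    for day, total in sorted(cumulative.items()):
        new_cases[day] = total - previous
        previous = total
    return new_cases


def align_by_date(mobility, new_cases):
    """Return (date, mobility, new_cases) rows for dates present in both series."""
    shared = sorted(set(mobility) & set(new_cases))
    return [(day, mobility[day], new_cases[day]) for day in shared]


# Made-up values for illustration only, NOT real case or mobility data.
cumulative = {
    date(2020, 3, 23): 9,
    date(2020, 3, 24): 12,
    date(2020, 3, 25): 18,
}
mobility = {
    date(2020, 3, 24): 15.0,   # assumed: percent change from baseline
    date(2020, 3, 25): -60.0,
    date(2020, 3, 26): -65.0,
}

rows = align_by_date(mobility, daily_new_cases(cumulative))
for day, movement, cases in rows:
    print(day.isoformat(), movement, cases)
```

Rows produced this way can be plotted on a shared date axis to compare movement against new infections for each state.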

PRELIMINARY TIME SERIES FORECASTING USING FACEBOOK PROPHET

Time series forecasting predicts future values from previously observed data. Commonly used time series forecasting models include Linear Regression, Exponential Smoothing, and Auto-Regressive Integrated Moving Average (ARIMA). Although linear regression and exponential smoothing can handle several time series components, these models are sensitive to outliers and tend to produce narrow confidence intervals. The ARIMA model combines autoregression and moving average techniques and gives more realistic confidence intervals, but it typically needs a long history of data to predict future values reliably.
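As a point of comparison, simple exponential smoothing, the most basic member of the exponential smoothing family mentioned above, can be sketched in a few lines. The function name, the smoothing factor, and the sample series are illustrative assumptions; this is not the code used for the analysis in this blog.

```python
def simple_exponential_smoothing(series, alpha=0.5):
    """Return the smoothed level after each observation.

    alpha close to 1 follows the latest value closely; alpha close to 0
    reacts slowly. The one-step-ahead forecast is the last smoothed level.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    level = series[0]          # initialise the level with the first observation
    smoothed = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


# Made-up series for illustration only.
levels = simple_exponential_smoothing([10, 20, 30], alpha=0.5)
print(levels)  # [10, 15.0, 22.5]
```

Because every new observation feeds directly into the level, a single outlier shifts the forecast noticeably, which is the sensitivity described above.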

Prophet is an open-source algorithm developed by Facebook’s Core Data Science team for time series forecasting and predictive analysis. It is built on an additive regression model. Its parameters are flexible to tune, and it often produces reasonable forecasts with the default configuration. It is designed to be robust to outliers. It is suited to forecasting from hourly, daily, or weekly observations with at least a few months of history. Mathematically, Prophet is a decomposable time series model with three main components, namely trend, seasonality, and holidays, plus a noise/error term as a fourth component. Because each component is fitted as a function of time and the components are added together, Prophet can fit both linear and non-linear trends.
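The decomposition described above is commonly written as the additive model below. The symbols are introduced here for illustration and are not used elsewhere in this blog: g(t) is the trend, s(t) the seasonality, h(t) the holiday effect, and the last term the noise/error at time t.

```latex
\[
  y(t) = g(t) + s(t) + h(t) + \varepsilon_t
\]
```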


PREDICTED VS ACTUAL CONFIRMED CASES USING FACEBOOK PROPHET


PREDICTED VS ACTUAL DEATH CASES USING FACEBOOK PROPHET

In this blog, the Prophet model was used to predict the number of confirmed cases and deaths for the three states. The input datasets containing the number of confirmed cases and deaths are loaded into separate Python data frame objects. Each data frame is then reshaped to the two columns Prophet expects: ‘ds’, which holds the date, and ‘y’, which holds the value to be forecasted. After the model is fitted on this data frame, a new data frame covering the forecast dates is built by passing the future period to the ‘make_future_dataframe’ method. Here, the future period for both the confirmed cases and the deaths was set to 14 days, and the interval width was set to 0.99. The model writes its prediction for each row as ‘yhat’, together with the lower (‘yhat_lower’) and upper (‘yhat_upper’) bounds of the uncertainty interval. The interval width does not change the point forecast ‘yhat’; raising it only widens the band between the lower and upper bounds. Finally, an interactive plot is created using Plotly showing both the actual and the forecasted values.
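The workflow described above can be sketched roughly as follows. This is a minimal sketch, not the original notebook: the helper names `to_prophet_records` and `forecast_cases` and the sample counts are hypothetical, and it assumes the `pandas` and `prophet` packages are installed (the package was published as `fbprophet` before version 1.0). Only the 14-day horizon and the 0.99 interval width come from the text.

```python
from datetime import date, timedelta


def to_prophet_records(dates, counts):
    """Pair each date with its count under the column names Prophet expects."""
    if len(dates) != len(counts):
        raise ValueError("dates and counts must have the same length")
    return [{"ds": d, "y": float(c)} for d, c in zip(dates, counts)]


def forecast_cases(records, horizon_days=14, interval_width=0.99):
    """Fit Prophet on the records and forecast `horizon_days` past the history."""
    # Imported here so the helper above can be used without these packages.
    import pandas as pd
    from prophet import Prophet

    df = pd.DataFrame(records)
    df["ds"] = pd.to_datetime(df["ds"])

    model = Prophet(interval_width=interval_width)
    model.fit(df)

    # Extends the history by `horizon_days` rows at daily frequency.
    future = model.make_future_dataframe(periods=horizon_days)
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Made-up cumulative counts for illustration only, NOT real case data.
start = date(2020, 3, 7)
sample_dates = [start + timedelta(days=i) for i in range(5)]
sample_records = to_prophet_records(sample_dates, [1, 1, 2, 4, 7])
```

Calling `forecast_cases(sample_records).tail(14)` would return the 14 forecast rows; the same function can be run once on the confirmed-case series and once on the death series.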

As observed from the above graphs, the values forecasted by the Prophet model for confirmed cases and deaths were aligned with the actual values for Tamil Nadu and Maharashtra. For Kerala, the model predicted higher numbers than were actually recorded for both confirmed cases and deaths; the Kerala government’s stringent containment strategies kept the spread of infection and the number of deaths below the forecast.
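One way to put a number on how closely forecasts track the actual values is sketched below: the mean absolute percentage error, and the share of actual values that fall inside the forecast interval. Both function names and all sample numbers are hypothetical; the blog itself compares the series visually and does not report these metrics.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / actual, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # Days with zero actual cases or deaths make the ratio undefined.
        raise ValueError("actual values must be non-zero")
    total = sum(abs(a - p) / a for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def interval_coverage(actual, lower, upper):
    """Fraction of actual values lying within [lower, upper] on the same day."""
    if not actual or not (len(actual) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return inside / len(actual)


# Made-up numbers for illustration only.
actual = [100, 200, 300]
yhat = [110, 180, 300]
yhat_lower = [90, 210, 250]
yhat_upper = [120, 260, 350]

print(mean_absolute_percentage_error(actual, yhat))
print(interval_coverage(actual, yhat_lower, yhat_upper))
```

A low error with high coverage would correspond to the alignment seen for Tamil Nadu and Maharashtra, while actual values falling below the lower bound would correspond to the Kerala case.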

CONCLUSION

In this blog, the effect of the lockdown on flattening the curve has been examined for Tamil Nadu, Maharashtra and Kerala. Kerala was more successful in containing the virus than the other states, thanks to a diligent health workforce that prioritized early detection through a test-and-trace model. The other states should follow the example of Kerala, which succeeded in overturning the predicted values for both confirmed cases and deaths, as highlighted in this blog. Effective, phased exit strategies must be formulated before restrictions are lifted, without compromising the health infrastructure. Social distancing must also be practiced to break the chain of transmission.

REFERENCES

Coronavirus case data was downloaded from the Kaggle COVID data repository. Mobility data was downloaded from Google’s Mobility data repository (Google LLC, “Google COVID-19 Community Mobility Reports”, accessed 21-05-2020). Data exploration and graph plotting were done using Python and Google Colab. Facebook’s Prophet package was used for modeling.
