Abstract-Sumedh Kaul - Larkin Community Hospital, South Miami, FL
Title: Modelling and Forecasting for Coronavirus Disease 2019 (COVID-19) spread in Miami-Dade County, Florida, USA, 2020.
Abstract:
Background: The COVID-19 pandemic has taken a significant toll on healthcare systems and on human populations across the world. Researchers have made several attempts to understand the transmission pattern of the disease by building forecasting models. This study aims to build two- to eight-week projections of cumulative cases and deaths, and of new cases and deaths per day, in Miami-Dade County. The forecasts are intended to model the spread of coronavirus cases and deaths and to predict future points in the time series.
Methods: The data for the projections were taken from the US counties dataset published by The New York Times (https://github.com/nytimes/covid-19-data/blob/master/us-counties.csv). The statistical analyses were done in R version 3.6.3. The projections were generated using the Prophet logistic growth model and an Autoregressive Integrated Moving Average (ARIMA) model. The performance of the Prophet logistic growth model was checked by linear regression of predicted values on actual values. For the ARIMA model, the final model was chosen by the Akaike Information Criterion (AIC) after running auto-regression. The models were initially fitted to cumulative confirmed cases from 11th March 2020 to 30th June 2020, and forecasts were made up to 25th August 2020. The models were updated weekly to include more data points and improve the accuracy of the results.
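The regression check described in the Methods can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code used in the study. The function name and the two short series are hypothetical and are not taken from the study data.

```python
# Illustrative sketch only: regress predicted values on actual values and
# report the slope, intercept, and R-squared of the fit.
# The function name and the sample numbers below are hypothetical.

def regress_predicted_on_actual(actual, predicted):
    """Ordinary least squares with actual as x and predicted as y."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(actual) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in actual)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(actual, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For simple linear regression, R-squared equals the squared correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical cumulative counts (not study data).
actual = [100.0, 220.0, 450.0, 900.0, 1600.0]
predicted = [110.0, 210.0, 470.0, 880.0, 1620.0]

slope, intercept, r_squared = regress_predicted_on_actual(actual, predicted)
print(f"slope={slope:.3f} intercept={intercept:.1f} R^2={r_squared:.4f}")
```

A slope near 1, an intercept near 0, and an R-squared near 1 would indicate that predicted values track actual values closely. Note that a high R-squared on cumulative series is expected, since both series increase monotonically.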
Results: The models were last updated on 12th August 2020. The Prophet logistic growth model predicts a cumulative number of 177,914 (95% CI: 175,253, 180,649) cases and a cumulative number of 2,220 (95% CI: 2,184, 2,258) deaths by 25th August 2020. The R-squared value for both cumulative cases and deaths was 0.99 (p-value < 0.001). Similarly, the ARIMA model predicts a daily number of 1,550 (95% CI: 226, 2,874) cases and 31 (95% CI: 8, 54) deaths. The ARIMA results reported are from the models with the lowest AIC (daily number of cases: 2,180; daily number of deaths: 1,155).
Conclusion: Actual and predicted cases and deaths followed similar trends through 11th August 2020, and model performance was good for predicting cases and deaths in Miami-Dade County. The models could not incorporate hospital-record data such as infection rates and dates of exposure to the disease, which are not publicly available but would have been important factors in predicting cases and deaths. Because predicted cases and deaths change as interventions change, the projections have to be updated regularly. Other appropriate methods, such as the SEIRD (Susceptible, Exposed, Infectious, Recovered, Deceased) model, may work better with hospital-record data, if such data become publicly available, for predicting the number of cases and deaths.
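For readers unfamiliar with the SEIRD structure mentioned in the Conclusion, a minimal sketch of the standard compartmental model is given here, integrated with simple Euler steps in standard-library Python. It was not part of the study. The function name, population size, and all rate parameters (beta, sigma, gamma, mu) are hypothetical placeholders, not estimates for Miami-Dade County.

```python
# Illustrative sketch only: a textbook SEIRD compartmental model integrated
# with fixed-step Euler updates. All parameter values are hypothetical.

def simulate_seird(population, initial_infectious, beta, sigma, gamma, mu,
                   days, dt=0.1):
    """Return a list of daily (S, E, I, R, D) tuples, starting at day 0.

    beta:  transmission rate per day
    sigma: rate of leaving the exposed state (1 / incubation period)
    gamma: recovery rate per day
    mu:    death rate per day among the infectious
    """
    s = float(population - initial_infectious)
    e, i, r, d = 0.0, float(initial_infectious), 0.0, 0.0
    history = [(s, e, i, r, d)]
    steps_per_day = int(round(1.0 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            new_deaths = mu * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered - new_deaths
            r += new_recovered
            d += new_deaths
        history.append((s, e, i, r, d))
    return history


# Hypothetical inputs chosen only to make the sketch run.
history = simulate_seird(population=1_000_000, initial_infectious=10,
                         beta=0.5, sigma=0.2, gamma=0.1, mu=0.002, days=60)
s, e, i, r, d = history[-1]
print(f"day 60: infectious={i:.0f} recovered={r:.0f} deceased={d:.0f}")
```

In practice the rates would have to be estimated from data such as exposure dates and infection rates, which is why this approach depends on the hospital-record data noted above.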