The Schedule
Day 1
January 28, 2021 All times in ET
Plenary Talk
Forrest Crawford, Yale University
9:45 - 10:45
COVID-19 transmission models in the real world: models, data, and public policy
Chair: Yulia R. Gel
Session 1: COVID Data
10:45 - 11:45
Chair: Alexander Volfovsky
10:45 - 11:00
Manar Alkuzweny, Scripps Research
Improving standardization and accessibility of epidemiological data with the ‘outbreakinfo’ R package
Epidemiological data on the COVID-19 pandemic is available from a variety of sources in an array of formats. The problems associated with the lack of standardization of this data have led to the development of outbreak.info, which combines data from multiple sources to provide a searchable web interface that allows users to explore visualizations of spatiotemporal data on COVID-19 cases, hospitalizations, testing, and deaths worldwide. To increase the accessibility of this data, we have developed an R package complement to outbreak.info, ‘outbreakinfo’, that accesses the raw data used to generate these visualizations via the API. This package allows users to easily pull data directly into R for downstream analysis and visualization. Users can retrieve data by specifying administrative level (World Bank region, country, state/province, metropolitan area, and/or county) or by constructing a custom query with additional parameters. In addition, the package allows users to directly plot metrics of interest for specified locations. The data also includes geometric features, allowing users to create spatial visualizations of epidemiological data. Finally, additional functionality is being developed to allow users to detect peaks in epidemiological curves for specified locations.
11:00 - 11:15
Fabian Santiago, University of California, Merced
Global Sensitivity Analysis of a Mathematical Model of COVID-19 Transmission Dynamics in a University Setting
In March 2020, the University of California, Merced (UC Merced), along with other universities throughout the United States, moved to an online-only mode of course delivery to slow the spread of SARS-CoV-2, the virus that causes COVID-19. During that time, the UC Merced leadership focused on how to safely bring students back to campus in the Fall. At UC Merced, this involved using mathematical models to evaluate the effectiveness of proposed mitigation strategies for containing the spread of COVID-19 within the university setting. In this talk, I will discuss the mathematical model we used to evaluate Fall 2020 re-opening strategies, a system of ordinary differential equations, and present a global sensitivity analysis of the contact and infection model parameters that govern the transmission dynamics of COVID-19 within the university setting.
11:15 - 11:30
Yusuf Afolabi, University of Louisiana at Lafayette
How Efficient is Contact Tracing in Mitigating the Spread of Covid-19? A Mathematical Modeling Approach
Contact Tracing (CT) is one of the measures taken by government and health officials to mitigate the spread of the novel coronavirus. In this talk, we investigate its efficacy by developing a compartmental model for assessing its impact on mitigating the spread of the virus.
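As a rough illustration of how a compartmental model can represent contact tracing, the sketch below assumes a hypothetical SIQR structure (susceptible, infectious, quarantined after tracing, recovered) with illustrative parameter values. The abstract does not state the authors' compartments or rates, so every name and number here is an assumption.

```python
def simulate_siqr(beta, gamma, tau, n=10_000.0, i0=10.0, days=365, dt=0.1):
    """Forward-Euler integration of a hypothetical SIQR model.

    beta:  transmission rate per day
    gamma: recovery rate per day
    tau:   rate at which infectious people are traced and quarantined
    Returns the final (S, I, Q, R) state.
    """
    s, i, q, r = n - i0, i0, 0.0, 0.0
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / n * dt
        traced = tau * i * dt          # infectious people moved to quarantine
        recovered_i = gamma * i * dt
        recovered_q = gamma * q * dt
        s -= new_infections
        i += new_infections - traced - recovered_i
        q += traced - recovered_q
        r += recovered_i + recovered_q
    return s, i, q, r


def attack_rate(beta, gamma, tau, n=10_000.0):
    """Fraction of the population no longer susceptible at the end of the run."""
    s, _, _, _ = simulate_siqr(beta, gamma, tau, n=n)
    return 1.0 - s / n


# Illustrative comparison: the same outbreak with and without tracing.
no_tracing = attack_rate(beta=0.3, gamma=0.1, tau=0.0)
with_tracing = attack_rate(beta=0.3, gamma=0.1, tau=0.2)
```

In this toy setting, tracing acts as an extra removal rate on the infectious compartment, which is one simple way to express its effect on transmission.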
11:30 - 11:45
Peter Cho, Duke University
Inequities in Representation in COVID-19 Studies As A Result of Bring-Your-Own-Device Study Model
Peter Cho, Ryan Shaw, Jessilyn Dunn
Researchers in earlier studies collected digital biomarkers from patients in a clinically controlled setting, where the participants were a preselected cohort and medical devices for digital biomarker collection were often provided. However, with advances in recruitment methods, there is a surge of studies where data is collected through a Bring-Your-Own-Device (BYOD) approach, in which participants voluntarily sign up for the study and provide their consumer mobile health data over a period of time. During the COVID-19 pandemic, BYOD studies have provided a rapid way to collect data relevant to COVID-19 detection and biomarker discovery. However, imbalances in data often arise in BYOD studies because technological, sociocultural, and economic barriers can reduce the representation of certain demographics.
In our own COVID-19 study, CovIdentify, we noticed that the representation of some groups did not align with national percentages. People who are Black/African American or Hispanic/Latino have lower representation in the study compared to the general population. Blacks/African Americans make up 13.4% of the U.S. population, but only 3.6% of our study. Hispanics/Latinos make up 18.5% of the U.S. population, but only 4% of our study. Since this inequity in representation affects data quality and can result in a biased COVID-19 detection algorithm, we sought to address the economic, technological, and sociocultural barriers that may limit participation from underrepresented groups. We have presented to local community groups and altered our social media advertisement strategy. By continuing to improve our outreach efforts, we plan to design a COVID-19 detection model that can accurately screen a diverse demographic.
Session 2: Forecasting
11:45 - 12:45
Chair: Ignacio Segovia Dominguez
11:45 - 12:00
Forecasting of COVID-19 daily report data has been one of several challenges posed to governments and health sectors worldwide. To facilitate informed public health decision making for the coming weeks, the concerned parties rely on short-term daily and weekly projections generated via predictive modeling. Although several models have been proposed in the literature, there has been a growing debate amongst researchers over model performance evaluation and over finding the best model for a certain feature (cases, deaths, hospitalizations, etc.), a particular regional level (county, state, country, etc.), and more. We calibrate stochastic variants of six different growth models (logistic, generalized logistic, Richards, generalized Richards, Bertalanffy, and Gompertz) and the basic SIR model in one flexible Bayesian modeling framework. We perform time-series cross-validation to compare prediction error metrics of the methods considered. In terms of regions, we consider the 50 US states and Washington DC for a state-level analysis and the top 20 countries by case count for a country-level analysis. After fitting the models, we visualize the mean absolute percentage error (MAPE) and its symmetric version for the model forecasts. In general, as the models were fit to more data, the predictive performance improved drastically across all regions. Finally, to find the best model for a certain state or country, we average the prediction error metric over the entire period and compare the results. In conclusion, no single model proved to be a gold standard across all regions.
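The two error metrics named in the abstract can be sketched as follows. The abstract does not say which variant of the symmetric MAPE was used, so the definition below (bounded by 200%) is an assumption, and the sample numbers are made up for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def smape(actual, forecast):
    """Symmetric MAPE, in percent, using the |a| + |f| denominator (assumed variant)."""
    terms = [
        0.0 if a == f == 0 else 2.0 * abs(f - a) / (abs(a) + abs(f))
        for a, f in zip(actual, forecast)
    ]
    return 100.0 * sum(terms) / len(terms)


# Made-up daily case counts and forecasts.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
mape_value = mape(observed, predicted)
smape_value = smape(observed, predicted)
```

Averaging either metric over all cross-validation windows gives one score per model and region, which is the kind of comparison the abstract describes.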
12:00 - 12:15
The COVID-19 pandemic has caused over 67 million cases and over a million deaths globally; of these, over 2.5 million cases and over 55 thousand deaths have occurred on the African continent thus far. It appears that mankind is losing the battle to contain the disease. Therefore, prevention and surveillance will be the cornerstone of the interventions left to halt further spread of the pandemic. Google Health Trends (GHT), a free internet tool, may be valuable to help anticipate outbreaks, identify disease hotspots, or better understand the patterns of disease surveillance. In this research, our objective was to use GHT to anticipate and characterize COVID-19 incidence across Africa. To pursue this goal, we collected the number of searches from the GHT data source related to COVID-19 using four terms, namely coronavirus, coronavirus symptoms, COVID19, and pandemic. The terms were correlated to weekly COVID-19 case incidences for 54 countries in Africa at the national level, between February 2 and August 2, 2020, via a multiple linear regression analysis. We also collected 70 other variables relating to demographics, economics, internet accessibility, and disease burden, among other predictors for each country, to characterize the mechanisms that might explain the relationship between GHT searches and COVID-19 incidence. Average death incidences were also calculated over 3 different time periods as a measure of which countries were more affected during the interval studied. Our study shows that GHT lacked the predictive power to anticipate COVID-19 cases in most African countries, and its lack of applicability was not explained by any of the analyzed indicators when normally distributed, nor by using a log-adjusted method. It was only able to characterize COVID-19 incidence in Tanzania, Tunisia, and Burkina Faso out of the 54 countries analyzed; our analysis yielded an adjusted R^2 value greater than 0.6 for these countries. We further observed that GHT may best be used during the early stages of an outbreak, when infection growth coincides with high search volume for disease-specific terms, as seen in these three countries. Furthermore, GHT performed poorly in countries that did not have rapid infection growth until much later on, as observed in the other countries.
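For readers unfamiliar with the adjusted R^2 threshold quoted above, the standard adjustment penalizes R^2 for the number of predictors. The sketch below applies that formula to illustrative inputs: 26 weekly observations and four search-term predictors are assumptions chosen to resemble the study design, and 0.7 is a made-up R^2 value, not a reported result.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R^2 for a linear regression with an intercept."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / dof


# Hypothetical inputs: 26 weeks, 4 search terms, raw R^2 of 0.7.
example = adjusted_r_squared(0.7, n_obs=26, n_predictors=4)
```

With few weekly observations, the adjustment is noticeable, which is why a fit that looks strong on raw R^2 can fall below a fixed adjusted threshold.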
12:15 - 12:30
COVID-19 became a pandemic just a few months after it was first detected in China. Ecuador has reported one of the highest rates of COVID-19 in Latin America, with more than 62,000 cases and 8,500 deaths in a country of approximately 17 million people. The dynamics of the outbreak have been observed to differ considerably by province, with high reported prevalence in some low-population-density provinces. In this study, we aim to understand the variations in outbreaks between provinces and to assist essential preparedness planning so that the ongoing COVID-19 outbreak can be responded to effectively. We estimate the critical quarantine rate, along with the corresponding leakage, needed to avoid overwhelming the local health care system. The results suggest that provinces with high population density can avoid a large disease burden provided they initiate early and stricter quarantine measures, even under a low isolation rate. To the best of our knowledge, this study is the first from the region to determine which provinces will need the most preparation for the current outbreak in the fall and which might need more help.
12:30 - 12:45
Manuela Runge, Northwestern University
Predicting intensive care unit occupancy and thresholds for action to avoid exceeding capacities in Chicago, Illinois
SARS-CoV-2 continues to spread across the US and threatens quality care of severely ill COVID-19 patients in intensive care units (ICU), many operating near capacity limits. The aim of this modelling study was to identify occupancy thresholds at which current mitigation measures should be intensified to prevent an ICU overflow. We extended a stochastic SEIR model for SARS-CoV-2 with compartments for symptom status, hospitalization, ICU, and deaths. We calibrated the model for Chicago, Illinois, using local data on ICU census, hospitalizations, and deaths until the end of August, and assumed two transmission increase scenarios between September and December 2020. In the simulations, intensified mitigation resulting in various levels of decrease in transmission was triggered when the ICU occupancy reached a pre-defined fraction of the capacity. We found that even small changes in transmission could lead to an overflow of the ICU capacity. If the transmission increase was high, a threshold of 80% was insufficient to prevent an overflow, even at high mitigation effectiveness. At a lower level of transmission increase, a threshold of 80% prevented an overflow if mitigation decreased the reproductive number to less than one (60-80% of the simulations below capacity), whereas a 20-40% probability of an overflow remained depending on the mitigation effectiveness and uncertainty in model parameters. The model results suggest that a threshold of less than 60% is needed to reliably prevent ICU overflow and that transmission growth, mitigation strength, and the anticipated delay in mitigation effect are important factors for selecting appropriate thresholds.
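The trigger mechanism described above can be illustrated with a much simpler model than the one in the talk. The sketch below is a deterministic SEIR model with a single ICU compartment, not the authors' stochastic model, and every parameter value (population, capacity, rates, mitigation strength) is a made-up assumption. It only shows how mitigation switched on at an occupancy threshold changes the peak ICU census.

```python
def peak_icu(trigger_fraction, capacity=500.0, n=1_000_000.0, days=365, dt=0.25,
             beta=0.4, sigma=0.25, gamma=1 / 6, p_icu=0.01, icu_exit=0.1,
             mitigation=0.3):
    """Peak ICU occupancy in a toy deterministic SEIR model (forward Euler).

    Transmission is multiplied by `mitigation` from the first step at which
    ICU occupancy reaches trigger_fraction * capacity. All defaults are
    hypothetical values chosen for illustration only.
    """
    s, e, i, icu = n - 10.0, 0.0, 10.0, 0.0
    triggered = False
    peak = 0.0
    for _ in range(int(days / dt)):
        if not triggered and icu >= trigger_fraction * capacity:
            triggered = True
        rate = beta * mitigation if triggered else beta
        new_exposed = rate * s * i / n * dt
        new_infectious = sigma * e * dt
        leaving_infectious = gamma * i * dt
        icu_admissions = p_icu * leaving_infectious  # share of cases needing ICU
        icu_discharges = icu_exit * icu * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - leaving_infectious
        icu += icu_admissions - icu_discharges
        peak = max(peak, icu)
    return peak


unmitigated_peak = peak_icu(trigger_fraction=float("inf"))  # never triggers
triggered_peak = peak_icu(trigger_fraction=0.6)
```

Because people already exposed still progress to infection and ICU admission after the trigger, occupancy keeps rising for a while, which is the delay effect the abstract lists among the factors for choosing a threshold.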
Session 3: COVID Modelling: Mobility and Mitigation
12:45 - 1:45
Chair: Folashade Agusto
12:45 - 1:00
With the onset of SARS-CoV-2 and the resulting shelter-in-place guidelines, human mobility in 2020 has been greatly impacted. Using a variety of high-resolution datasets, studies have sought to measure and explain changes in mobility to better understand and anticipate the spread of the disease. Existing studies typically use a static baseline, representing mobility before the onset of the pandemic, to determine whether mobility increases or decreases at specific points in time for specific study areas, and relate these changes to certain pandemic and policy events. However, a more comprehensive analysis of mobility change over time is needed. This study aims to identify and describe common temporal trends in mobility in the US at the county level. We first dynamically measured changes in mobility for each county by comparing a 7-day rolling average of minutes spent away from home in 2020 to the same measure in 2019. We then used principal component analysis to reduce the dimensionality of the data to three latent components, where each component is explained by a time series representing change in mobility from January 2020 to December 31, 2020. The first and strongest component corresponds to vastly reduced mobility after the start of the pandemic, the second component corresponds to seasonal mobility ignoring the pandemic, and the third component describes an initial reduction of mobility with a gradual return to normality. By describing each county as a linear combination of these three components, we are able to explain 57% of the variation in mobility trends across all counties. Next, we used k-means clustering analysis to find which counties have similar weighted combinations of the three principal components. Finally, local and global spatial autocorrelation were calculated for each of these principal components, and each was found to be significantly autocorrelated.
Results indicate that mobility trends in 2020 for each county can be explained as a combination of three mobility trends and that US counties that are geographically close are more likely to exhibit similar trends.
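The first step of the analysis, comparing a 7-day rolling average in 2020 against the same measure in 2019, can be sketched as below. The function names and the use of relative change (rather than a difference or ratio) are assumptions, since the abstract does not give the exact formula; the PCA and clustering steps are not shown.

```python
def rolling_mean(values, window=7):
    """Trailing rolling mean; returns len(values) - window + 1 values."""
    if len(values) < window:
        raise ValueError("series is shorter than the window")
    running = sum(values[:window])
    means = [running / window]
    for k in range(window, len(values)):
        running += values[k] - values[k - window]
        means.append(running / window)
    return means


def relative_change(current_year, baseline_year, window=7):
    """Relative change of the smoothed current series against the smoothed baseline."""
    current = rolling_mean(current_year, window)
    baseline = rolling_mean(baseline_year, window)
    return [(c - b) / b for c, b in zip(current, baseline)]


# Made-up daily minutes away from home for one county over two weeks.
minutes_2019 = [100] * 14
minutes_2020 = [100] * 7 + [50] * 7
change = relative_change(minutes_2020, minutes_2019)
```

Using the same calendar window from the previous year as the baseline removes seasonal effects that a single static pre-pandemic baseline would leave in the signal.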
1:00 - 1:15
Mobility is one of the potential factors contributing to the spread of COVID-19. Our goal is twofold. First, we estimate the effect of state-wide policies on mobility signals at the county level, using a regression discontinuity design to compare mobility before and after interventions. Based on our assumptions, we find that a state-wide emergency declaration tends to be more effective in areas with a large population, a small percentage of people in poverty, a high percentage of people with an educational background, and a low unemployment rate. Second, we find that recommended policies, gathering restrictions, and school closures may not have a significant effect in reducing mobility, whereas some mandatory policies, such as public mask requirements and business closures, have a significant effect on mobility in California counties. We rank the effectiveness of the interventions on mobility by their regression coefficients in a linear regression setting.
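A bare-bones version of a regression discontinuity estimate is sketched below: fit one line to the days before the intervention and another to the days after, then compare the two fits at the cutoff. This is a simplified illustration with synthetic data and hypothetical function names; the abstract does not describe the authors' bandwidth, covariates, or estimator.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(days, mobility, cutoff):
    """Estimated jump in mobility at the cutoff day (after minus before)."""
    before = [(d, m) for d, m in zip(days, mobility) if d < cutoff]
    after = [(d, m) for d, m in zip(days, mobility) if d >= cutoff]
    a0, b0 = fit_line(*zip(*before))
    a1, b1 = fit_line(*zip(*after))
    return (a1 + b1 * cutoff) - (a0 + b0 * cutoff)


# Synthetic county signal: same trend on both sides, a drop of 3 units at day 0.
days = list(range(-10, 10))
mobility = [10.0 + 0.5 * d if d < 0 else 7.0 + 0.5 * d for d in days]
jump = rd_jump(days, mobility, cutoff=0)
```

Fitting each side separately lets the pre-existing trend differ from the post-intervention trend, so the estimated jump reflects the discontinuity at the policy date and not a gradual drift.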
1:15 - 1:30
The novel infectious disease (COVID-19) has become a pandemic in just a few months and has currently reached 213 countries, with more than 20 million reported cases and 74 thousand deaths. Ecuador has reported one of the highest rates of COVID-19 in Latin America, with Pichincha and Guayas provinces encompassing more than 40% of the reported cases. The transmission of COVID-19 infection in Ecuador has been the result of contact patterns, mobility structure of the population, regional epidemiology, and efficacy of public health interventions. In this study, we link provincial-level demographic, epidemiological, transportation, and resources information with the spread of the COVID-19 outbreak to understand the role of mobility among provinces on the number of infections, hospitalizations, and deaths in provinces of Ecuador. The analysis is carried out using best (with no inter-provincial movement) and worst (with movement patterns similar to before the COVID-19 outbreak) case scenarios in Ecuador. In addition, limited and unlimited ICU resources are considered to see if an increase in the number of ICU beds and ventilators can help counteract high mobility among provinces. The results suggest that human movement (instead of local epidemiology) has primarily been shaping transmission dynamics of COVID-19 in Ecuador by introducing infected individuals regularly into low-risk provinces.
1:30 - 1:45
Joseph Zinski, University of Pennsylvania
Networks of Necessity: Simulating Strategies for COVID-19 Mitigation among Disabled People and Their Caregivers
A major strategy to prevent the spread of COVID-19 is through the limiting of in-person contacts. However, for the many disabled people who live in the community and require caregivers to assist them with activities of daily living, limiting contacts is impractical or impossible. We seek to determine which interventions can prevent infections among disabled people and their caregivers. To accomplish this, we simulate COVID-19 transmission with a compartmental model on a network. The networks incorporate heterogeneity in the risks of different types of interactions, time-dependent lockdown and reopening measures, and interaction distributions for four different groups (caregivers, disabled people, essential workers, and the general population). Among these groups, we find the probability of becoming infected is highest for caregivers and second highest for disabled people. Our analysis of the network structure illustrates that caregivers have the largest modal eigenvector centrality among the four groups. We find that two interventions -- contact-limiting by all groups and mask-wearing by disabled people and caregivers -- particularly reduce cases among disabled people and caregivers. We also test which group most effectively spreads COVID-19 by seeding infections in a subset of each group and then comparing the total number of infections as the disease spreads. We find that caregivers are the most effective spreaders of COVID-19. We then test where limited vaccine doses could be used most effectively and we find that vaccinating caregivers better protects disabled people than vaccinating the general population, essential workers, or the disabled population itself. Our results highlight the potential effectiveness of mask-wearing, contact-limiting throughout society, and strategic vaccination for limiting the exposure of disabled people and their caregivers to COVID-19.
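Eigenvector centrality, the network measure mentioned in the abstract, can be computed with power iteration. The sketch below uses a tiny hypothetical contact network with made-up node names; it computes per-node centrality only and does not reproduce the authors' networks or their group-level summary.

```python
def eigenvector_centrality(edges, iterations=200):
    """Eigenvector centrality of an undirected, connected graph by power iteration."""
    nodes = sorted({node for edge in edges for node in edge})
    neighbors = {node: set() for node in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Multiply by (A + I): the identity shift keeps the same leading
        # eigenvector but avoids oscillation on bipartite graphs.
        updated = {
            node: score[node] + sum(score[nb] for nb in neighbors[node])
            for node in nodes
        }
        norm = sum(x * x for x in updated.values()) ** 0.5
        score = {node: x / norm for node, x in updated.items()}
    return score


# Hypothetical network: one caregiver serving three disabled clients,
# two of whom have one additional contact each.
contacts = [
    ("caregiver", "client_1"),
    ("caregiver", "client_2"),
    ("caregiver", "client_3"),
    ("client_1", "family"),
    ("client_3", "neighbor"),
]
centrality = eigenvector_centrality(contacts)
most_central = max(centrality, key=centrality.get)
```

A node scores highly when its contacts are themselves well connected, which is why someone who links several households can dominate the ranking even in a small network.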
Session 4: TDA for COVID Modelling
1:45 - 2:45
Chair: Mariel Vazquez
1:45 - 2:00
Topological data analysis (TDA) treats a data cloud as a geometric object and tries to extract topological features from it. We apply one of the TDA tools, the Mapper algorithm, to the U.S. COVID-19 data. This novel method provides useful visualizations of the pandemic and allows for easy comparisons across time and space. We will explain how these visualizations encode a variety of geometric features of the data cloud based on geographic information, time progression, and the number of COVID-19 cases.
2:00 - 2:15
This paper introduces new unsupervised clustering methodologies for time series data using novel methods from topological data analysis. Cluster identification is improved by extracting persistent topological signatures via persistent homology, a modern algebraic method for summarizing distinctive features of functions and shapes. We present algorithmic adaptations of two classic clustering methods, k-means and hierarchical clustering, and analyze the impact of adding persistent features in each clustering process. We assess the performance of our approach on synthetic data, successfully recognizing different types of time series. Finally, visual and statistical results on current COVID-19 confirmed cases in North Carolina show that the proposed methods successfully characterize counties with similar COVID-19 dynamics. These results can serve as input for local governments to support their decision-making during the current pandemic.
2:15 - 2:30
RNA viruses can cause well-known severe diseases in humans, such as influenza, HIV, Hepatitis C, Ebola, SARS, MERS, and the recent outbreak of COVID-19. The evolution of these viruses is driven by a relatively high mutation rate compared to DNA viruses, as well as genetic exchange events, which involve reassortment and recombination. The latter can occur during co-infection of a cell with two or more related viruses and has been shown to play a key role in the ability of RNA viruses to expand their host range to humans. Vertical evolution is generally understood as the genetic changes occurring over generations, whereas horizontal evolutionary processes include genetic exchange between variants. Understanding these events and their frequency is key to tracking the evolution of viruses. Phylogenetic reconstruction is one of the most common methodologies for representing species divergence. However, phylogenetic trees fail to capture horizontal events, partially because they are limited to a two-dimensional space. An emerging application of algebraic topology to evolutionary studies, Topological Data Analysis (TDA), allows us to infer horizontal evolutionary events from large amounts of genomic data. Conclusions are drawn from the topological properties of the data represented in higher dimensions, applying mathematically well-founded tools in the framework of persistent homology. In our research, we have applied TDA to study recombination in bat coronaviruses and human SARS-CoV-2. We have been able to identify recombination events occurring among bat coronaviruses and suggest possible ongoing recombination occurring among SARS-CoV-2 variants. Our results provide further insight into the application of TDA to biological sequence analysis and open the door for TDA to become a tool for tracking the evolution of these viruses in natural reservoirs before they emerge as human-infecting diseases.
2:30 - 2:45
Zhiwei Zhen, University of Texas at Dallas
Covid-19 Forecasting via Deep Learning and Topological Data Analysis
We propose a new approach to forecasting COVID-19 dynamics based on integrating topological signatures of atmospheric factors into deep neural networks, in particular Long Short-Term Memory models. We validate the utility of the new biosurveillance tool by applying it to tracking and predicting COVID-19 spread in the state of North Carolina, USA.
Session 5: Panel
2:45 - 3:45
Panelists:
Dr. Katherine Heller, Google and Duke University
Dr. Padhu Seshaiyer, George Mason University
Panel Moderator:
Alex Volfovsky
Day 2
January 29, 2021 All times in ET
Session 1: COVID Epidemic Modelling I
10:45 - 11:45
Chair: Padmanabhan Seshaiyer
10:45 - 11:00
Each state in the United States exhibited a unique response to the COVID-19 outbreak, along with dynamic levels of testing, leading to different actual case burdens. In this study, we utilize testing data, via a per-capita testing variable ascertainment rate, as well as case and death data, to fit a minimal epidemic model for each state. We estimate infection-level dependent lockdown entry and exit rates (representing government and behavioral response), along with the true number of cases as of May 31, 2020. Ultimately, we provide error-corrected estimates for commonly used metrics, such as infection fatality ratio and overall case ascertainment, for all 55 states and territories considered and for the United States in aggregate, both to analyze the United States' first wave and to suggest potential management strategies for future outbreaks. We generally find an analytically justified inverse relation between outbreak size and lockdown speed, and our simulations suggest a critical population quarantine "half-life" of 30 days, independent of other model parameters.
11:00 - 11:15
Sabrina Hetzel, Southern Methodist University
Modeling the Spread of COVID-19: A University Study
Understanding, containing, and eradicating COVID-19 is an enormous problem for the scientific community to tackle, and there are myriad facets that need to be solved. Our group in particular wishes to explore the viral spread in a university setting, since it is an ideal climate for a COVID-19 outbreak. Until recently, we aimed to understand how initial mass testing of the student body at the start of the semester, and continual testing throughout the semester, reduce the spread of the virus. Looking into the future, we wish to explore the conditions and implications for a nontrivial equilibrium state for the infected population.
11:15 - 11:30
Long Nguyen, George Mason University
Mathematical Models and Simulation of Multiphysics in Enclosed Spaces
Most epidemic models studying COVID-19 have focused on the macro scale, evaluating the progression of infected cases across large regions. However, the significant airborne infectivity of the virus has led to important public policy questions about safety measures in enclosed spaces like schools, aircraft, and hospitals. Alarmingly, there is a severe lack of coronavirus specific literature that models the medium to long term progression of infections in these small spaces. In this work, we introduce a novel multi-physics framework described using coupled Partial Differential Equations (PDE) including a coupled concentration transport model. We conduct a computational simulation study and perform parameter estimation with Physics Informed Neural Networks (PINNs), a novel supervised learning technique that allows us to use our PDE relations as priors to efficiently learn parameters in our system. Our computational results show that PINNs perform well with limited and noisy data.
11:30 - 11:45
This study presents a multi-stage stochastic programming epidemic compartmental model to address the resource allocation challenges of controlling COVID-19. In this model, we consider the uncertainty of untested asymptomatic infections at each stage and incorporate short-term migration when formulating the disease's transmission. The proposed multi-stage stochastic program includes various disease growth scenarios to optimize the distribution of resources, such as ventilators, while minimizing the total expected number of new infections and funerals. This study also takes into account the time-varying transmission rate due to various government intervention options, such as mask wearing, social distancing, and lockdowns. We apply infection and migration data to forecast COVID-19 transmission in the most-impacted counties in New York and New Jersey. The results indicate that short-term migration can influence the transmission of the disease significantly. Our model is practical and can be adapted to study other infectious diseases in complex situations.
Session 2: COVID Epidemic Modelling I
11:45 - 1:15
Chair: Hayriye Gulbudak
11:45 - 12:00
Andreea Magalie, Georgia Institute of Technology
Modeling shield immunity to reduce COVID-19 transmission in long-term care facilities
Interventions in the COVID-19 epidemic have relied predominantly on en masse approaches including large-scale lockdowns and social distancing. In contrast, testing individuals for virus and serological status can enable targeted mitigation, including isolation and immune shielding respectively. Immune shielding denotes the increase in activity by immune individuals (either recovered or vaccinated) thereby diluting the number of interactions between susceptible and infectious individuals (Weitz et al., Nature Medicine, 2020). Prior analyses of immune shielding have focused on population models with homogeneous mixing. Here we examine the effectiveness of immune shielding in reducing outbreak size on small bipartite networks consisting of patients and health-care workers resembling nursing homes or long-term health care facilities. Utilizing a dynamic, network rewiring algorithm, we find that epidemics are mitigated when increasing the proportion of immune health care workers caring for infectious patients and by increasing the proportion of immune patients treated by susceptible health care workers. These immune shielding rewiring principles increase in effectiveness given more frequent rewiring of interactions, e.g., reducing an outbreak by more than half when implemented on a weekly basis. Our strategy proves to be even more effective in preventing secondary outbreaks when a fraction of the population is already immune (e.g., due to partial vaccination). We close by discussing how network-centered immune shielding could enable preemptive measures to decrease the size of future outbreaks.
12:00 - 12:15
Compared with traditional protein-focused strategies, targeting highly conserved crucial regions in the RNA genome can more fundamentally destroy the virus life cycle without mutation concerns. The 84nt frameshifting element (FSE) between SARS-CoV-2 Open Reading Frame 1a and 1b is such a target. Its flexible pseudoknotted structure is postulated to stall and backtrack the ribosome, so that the two overlapping reading frames can be translated properly. Hence, finding ways to destroy or stabilize the pseudoknot can inhibit this frameshifting process and stop the following cascade of protein synthesis. In our work, we use dual graphs to represent RNA structures with pseudoknots. This coarse-grained approach makes our method insensitive to small variations in base pairing, which occur inevitably in RNA structure prediction. We apply an inverse folding tool RAG-IF, developed in our RAG (RNA-As-Graphs) framework, to identify key residues that can destroy or strengthen the pseudoknot with minimal mutations. Our predicted mutations have already gained support from chemical reactivity experiments, and we show a merely 2-residue mutation can dramatically change the FSE conformation. These residues are good targets for gene-editing and drug binding.
12:15 - 12:30
Sofia Jakovcevic, UC Davis
Profile Hidden Markov Modeling to Track Mutations in SARS-CoV-2
Using these pHMMs we can detect and track point mutations, delete mutations, insert mutations, and possible recombination over time.
12:30 - 12:45
The COVID-19 pandemic is caused by the SARS-CoV-2 betacoronavirus. A vast international effort to harvest and sequence human SARS-CoV-2 genomes from around the world has led to an unprecedented amount of genomics data, which is invaluable to the study of the virus' evolutionary dynamics. Here, we focus on two previously reported substitutions, S477N and D614G. We study the relative fitness effects of the individual mutant strains and those of the double mutant, S477N/D614G. Using a profile Hidden Markov Model (pHMM), we analyze a data set containing ~150,000 sequences to determine the population dynamics of the three strains between March and November of 2020. These results, combined with a biophysical analysis of binding affinity, lead us to propose that binding effects partially explain the population dynamics of the mutants. We find that S477N has a binding fitness advantage with respect to wild type and that D614G has an advantage with respect to the double mutant, S477N/D614G. We observe that the two strains containing D614G are the most competitive and comprise a nearly bipartite viral population between August and October of 2020. This evidence suggests that the combined S477N/D614G variant has a fitness advantage over the wild type and could be concomitantly more infective.
12:45 - 1:00
"SARS-CoV-2 (CoV) is the biological agent that causes COVID-19, a respiratory disease that has become a global pandemic. Because the virus will evolve molecular-level solutions to maintain its fitness (as already seen in the B.1.1.7 variant from the United Kingdom), it is essential to characterize evolutionary patterns in a high-resolution manner and develop statistical tests for variant associations to phenotypes of interest (e.g., disease severity, geographic location, epidemiological timeframe). Here, we divided the CoV genome into 29 constituent regions and identified nonstructural protein 3 (nsp3) and Spike protein (S) as proteins with the highest variation and greatest correlation with the viral whole-genome variation. We demonstrate that geography and time best explain differences between gene regions of samples. We extend this analysis to different related CoV viruses, including MERS, SARS, and bat coronaviruses. Here too, S and nsp3 explain most of the variation; these two regions also show a high number of sites under selection. Our results provide a direction to prioritize genes associated with health outcomes and inform improved DNA tests to predict disease status and severity."
1:00 - 1:15
Taylor Howard, UC Davis
Novel Application of Automated Machine Learning with MALDI-TOF-MS for Rapid High-Throughput Identification of COVID-19: A Proof of Concept
"The COVID-19 pandemic created new challenges within the field of molecular infectious disease testing. The need for highly sensitive and specific tests that are both rapid and high throughput has come to the forefront of scientific consciousness, with the added awareness of strained supply chains. The current gold standard for diagnosing SARS-CoV-2 infection is reverse transcription (RT) polymerase chain reaction (PCR) which often faces a trade off between speed and throughput. Demand for these molecular tests has created shortages and limited allocation of testing supplies. These difficulties have resulted in the proposal of new tests that can both perform rapidly with high throughput and rely on alternate supply chains. One proposed testing strategy for identifying patients with COVID-19 is Matrix assisted laser desorption ionization (MALDI) – time of flight (TOF) – mass spectrometry (MS). Because there is a need for screening of large numbers of asymptomatic individuals, a relatively inexpensive, rapid test is highly favored. We studied MALDI-TOF MS to evaluate its potential to fill this role. Due to its unique nature, MALDI-TOF MS can be rapid, high-throughput and cost-effective while using different supply chains than traditional molecular tests. Because complex spectra are produced when running MALDI-TOF MS samples, we employed machine learning (ML) to analyze and optimize the resulting data. Residual nasal swab samples from adult volunteers were used for testing and compared against RT-PCR. Two optimized ML models were identified, exhibiting accuracy of 98.3%, positive percent agreement (PPA) of 100%, negative percent agreement (NPA) of 96%, and accuracy of 96.6%, PPA of 98.5%, and NPA of 94% respectively. Machine learning enhanced MALDI-TOF-MS for COVID-19 testing exhibited accuracy, PPA, and NPA comparable to existing commercial SARS-CoV-2 tests."
Session 3: COVID Epidemic Modelling II
1:15 - 2:15
Chair: Javier Arsuaga
1:15 - 1:30
Reese Richardson, Northwestern University
Estimating incident SARS-CoV-2 infection detection rates with mortality data
The detection of active SARS-CoV-2 infections is crucial for tracking the spread of the virus and for targeting public health interventions to reduce transmission and mortality. Individuals with a confirmed infection are also more likely to self-isolate than unsuspecting carriers. Moreover, the rate at which incident infections are detected at any given time is a critical parameter in SARS-CoV-2 transmission models that support decisions on public health measures. Estimating infection detection rates requires an approximation of the true number of incident infections, which can be roughly estimated by comparing case fatality rates to the expected infection fatality rate in a given population. However, this estimation is complicated by heterogeneous patterns of detection across age, geography, and severity of symptoms. Here, we present a method for inferring the instantaneous infection detection rate within a given demographic using case fatality rates, age-dependent infection fatality rates, hospital fatality rates, and excess mortality data. Applying this method to the state of Illinois, we estimate that fewer than 10% of all SARS-CoV-2 infections were detected prior to mid-April. Despite continued increases in the volume of diagnostic testing over the course of the pandemic, we estimate that the infection detection rate has yet to exceed 40%. Although the performance of this method degrades for demographics with low mortality (such as younger age groups) and in the absence of robust data, the method could nevertheless be used to gauge infection detection rates across a large population.
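The comparison of case fatality and infection fatality rates mentioned in the abstract can be illustrated with a rough sketch. It assumes, purely for illustration, that every COVID-19 death occurs among detected cases and that reporting lags can be ignored; the function name and the example numbers are hypothetical and are not taken from the authors' method, which additionally uses age-dependent fatality rates, hospital fatality rates, and excess mortality data.

```python
def detection_rate(case_fatality_rate, infection_fatality_rate):
    """Crude infection detection rate from two fatality rates.

    Illustrative assumption: all deaths are recorded among detected cases.
    Then deaths = CFR * detected_cases = IFR * true_infections, so
    detected_cases / true_infections = IFR / CFR.
    """
    if case_fatality_rate <= 0 or infection_fatality_rate <= 0:
        raise ValueError("fatality rates must be positive")
    # A ratio above 1 would mean more detected cases than infections; cap it.
    return min(1.0, infection_fatality_rate / case_fatality_rate)


# Hypothetical inputs: an observed CFR of 5% and an assumed IFR of 0.5%.
rate = detection_rate(0.05, 0.005)
```

Under these assumptions, a case fatality rate ten times the infection fatality rate implies that roughly one infection in ten was detected.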
1:30 - 1:45
Yue Pan, UNC Chapel Hill
Standardized COVID-19 time series data shows a stationary and non-Poisson process
"The analysis applied here was based on daily-infected numbers, which were generated from first difference of cumulative counts of coronavirus cases for each state in the US. Appropriate correction was applied to correct for continuous zeros or negative counts. A local linear smoothed curve was assessed by taking a 14 days trimmed moving average. And standardized daily-infected counts were calculated by dividing the corrected daily positive counts by the smoothed curve. The standardized data were further de-seasonalized by subtracting by the trimmed average counts of each weekday. For each state, the mean-to-variance-relationship shows that the mean of corrected daily-infected number is much smaller than local variance based on 14 days. This shows the data are far from a standard Poisson process, which indicates that models based on Poisson distribution may need substantial modification. After scaling and de-seasonalizing, the time series data shows an impressive stationary property, which is appropriate for further time series analysis. Study of the marginal distributions of these time series failed to find heavy tails."
1:45 - 2:00
Stochastic epidemic models provide a realistic and interpretable description of the spread of a disease through a population. Yet, fitting these models in missing data settings is a notably difficult task; in particular, when the epidemic process is only partially observed, the likelihood of the model does not have a closed form. To remedy this issue, this article introduces a data-augmented MCMC algorithm for fast and exact inference for the stochastic SIR model given discretely observed infection incidence counts. In a Metropolis-Hastings step, the algorithm proposes event times of the augmented data according to a stochastic process whose dynamics closely resemble those of the SIR process, and from which we can efficiently generate an epidemic that is compatible with the observed data. Not only is the algorithm fast, but, since the augmented data are generated from a very faithful approximation of the target epidemic model, the algorithm can update a large portion of the augmented data per iteration while maintaining a relatively high acceptance rate, thereby exploring the high-dimensional latent space efficiently. While existing MCMC approaches that do not rely on model simplifications or approximations become intractable for populations greater than a few thousand individuals, the proposed algorithm scales to outbreaks with hundreds of thousands of epidemic events even on a single laptop. We validate its performance via thorough simulation experiments and a case study on the 2014 Ebola outbreak in Western Africa.
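To make the setting concrete, the sketch below simulates the stochastic SIR model forward in time with the standard Gillespie algorithm and then bins the infection times into incidence counts, which is the kind of discretely observed data the abstract refers to. It is not the data-augmented MCMC algorithm itself; the function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_sir(n, i0, beta, gamma, rng):
    """Gillespie simulation of a stochastic SIR epidemic in a population of n."""
    s, i = n - i0, i0
    t = 0.0
    infection_times, removal_times = [], []
    while i > 0:
        rate_infection = beta * s * i / n
        rate_removal = gamma * i
        total = rate_infection + rate_removal
        # Waiting time to the next event is exponential with the total rate.
        t += rng.exponential(1.0 / total)
        if rng.random() < rate_infection / total:
            s, i = s - 1, i + 1
            infection_times.append(t)
        else:
            i -= 1
            removal_times.append(t)
    return np.array(infection_times), np.array(removal_times)


def incidence_counts(infection_times, bin_width=7.0):
    """Number of new infections in consecutive intervals of length bin_width."""
    if len(infection_times) == 0:
        return np.zeros(0, dtype=int)
    n_bins = int(np.floor(infection_times.max() / bin_width)) + 1
    edges = np.arange(n_bins + 1) * bin_width
    counts, _ = np.histogram(infection_times, bins=edges)
    return counts


rng = np.random.default_rng(0)
infections, removals = simulate_sir(n=200, i0=5, beta=0.5, gamma=0.2, rng=rng)
weekly = incidence_counts(infections)
```

Inference has to work backwards from `weekly` alone: the individual event times in `infections` and `removals` are the latent data that a data-augmentation scheme must propose.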
2:00 - 2:15
Epidemics are a public health issue of growing importance. An appropriate description of their behavior and a good prediction of their development over time are crucial for decision-making that preserves lives. In general, epidemiological models such as the SIR model and its main variants (SI, SIS, SEIR, SIRS) were designed for closed populations and do not consider the population distribution in the area where the epidemic develops. In the case of the COVID-19 pandemic, local outbreaks are reported that occur at different moments in time and spread at different speeds, as a consequence of demographic phenomena. Hence, it is important to propose a model that considers the migration of infectious people as a relevant factor, describing the effect that neighboring places have on a specific geographic area, the dynamics of each of the regions of a state, and how these dynamics affect the larger territory to which they belong. One way to do this is through network diffusion processes. In this work we present a model based on the SEIR model, in which a diffusion process runs over a graph representing communities in the state of Jalisco, grouped according to their infection level. The aim of this methodology is to reproduce the numbers of infections reported by the Secretaría de Salud in the state of Jalisco, México, and at the same time explain the infections within population groups at different hierarchical levels. We report our findings, based on COVID-19 contagion data from the period July 23 to August 31, 2020, and show the capabilities of our approach for capturing infection spread due to demographic phenomena.
Session 4: COVID Epidemic Modelling III
2:15 - 3:15
Chair: Asim Dey
2:15 - 2:30
Coronavirus disease 2019 (COVID-19) is a pandemic. To characterize the disease transmissibility, we propose a Bayesian change point detection model using daily actively infectious cases. Our model is built upon a Bayesian Poisson segmented regression model that can 1) capture the epidemiological dynamics under the changing conditions caused by external or internal factors; 2) provide uncertainty estimates of both the number and locations of change points; and 3) adjust for any explanatory time-varying covariates. Our model can be used to evaluate public health interventions, identify latent events associated with spreading rates, and yield better short-term forecasts.
2:30 - 2:45
Karnika Singh, Duke University
Physiological Changes Detection From Smartwatch Data Associated With COVID-19 Infection
"Early, and ideally, pre-symptomatic detection can limit the spread of a contagion and inform resource allocation, particularly with high-risk pathogens, such as COVID-19 or influenza. There are physiological changes associated with infection onset before the appearance of obvious symptoms that can be leveraged for timely infection detection. We have developed CovIdentify, a platform to enable early detection of COVID-19 infection using wearables data and data from surveys. Early analysis of wearables data reveals changes in physiological signal before an individual tests positive for COVID-19 infection. We aim to leverage these signals for digital biomarker development for COVID-19 infection and to develop CovIdentify as a platform for early detection of infection. "
2:45 - 3:00
Kirthi Kumar, University of California, Berkeley
Mathematical Modeling, Analysis, and Simulation of the COVID-19-Amplified Opioid Crisis with Prescription and Social Drug Addiction Models
Opioid overdose rates are rising, and CDC data suggest that both prescription and illicit drugs contribute significantly to the opioid epidemic. Unfortunately, with the onset of the COVID-19 pandemic, the pain of social isolation is increasingly motivating opioid misuse. The typical framework for mathematical models of infectious diseases is the classic compartmental model, in which ordinary differential equations (ODEs) describe the phases of the disease. Modeling drug addiction as an infectious disease can be a novel approach to this issue. This project explores two mathematical models: one describing the dynamics of addiction through over-prescription, and a second describing the dynamics of addiction influenced by social behavior as a coupled system of ODEs. In addition, parameters are estimated with machine learning; rural and urban prescription models are compared; a basic reproduction number for the social model is derived; support for healthcare providers and education are explored as controls for the opioid epidemic using enhanced models and optimal control theory; a numerical simulation of each model is implemented with high-order numerical approximation; and a graphical user interface (GUI) is developed in MATLAB. This GUI is particularly useful during the COVID-19 pandemic, when parameters are constantly fluctuating. This project can be applied to improve the state of affairs relating to the opioid epidemic and can be further extended through clinical trials for parameter estimation, introduction of technological rehabilitation, and robust models.
3:00 - 3:15
Soheil Saghafi and Emel Khan, New Jersey Institute of Technology
Circadian Rhythms and COVID-19: Modeling Circadian Clock Regulation of Immune System Response to SARS-CoV-2 Infection and Treatment with Remdesivir
Circadian clocks regulate many aspects of human physiology, including the response of the immune system to viral infections. We extended a mathematical model of SARS-CoV-2 dynamics fit to viral load data from several COVID-19 patients (Goyal et al, Sci Adv, 2020) to study how circadian variation of the model parameters affects the amount of time required to clear the virus. We found that circadian rhythms in the death rate of the virus reduced the time required to clear the virus, whereas rhythms in the production rate of the virus increased it. We also found opposing effects for circadian variation of immune system parameters: circadian rhythms in the innate immune response reduced the time required to clear the virus, whereas rhythms in the adaptive immune response increased it. We then used the model to explore whether the time of day that the antiviral therapy remdesivir is administered affects its efficacy. Our preliminary simulations show that the size of the time-of-day effect depends on the potency of the drug and if it is administered early or late in the course of the infection, with the largest effects seen for low to medium potency remdesivir treatment beginning in the pre-symptomatic phase.
Plenary Talk
Andrea Bertozzi, UCLA
3:15 - 4:15
Epidemic modeling – basics and challenges
Chair: Ignacio Segovia Dominguez
I will review basics of epidemic modeling including exponential growth, compartmental models and self-exciting point process models. I will illustrate how such models have been used in the past for previous pandemics and what the challenges are for forecasting the current COVID-19 pandemic. I will show some examples of fitting of data to US states and what one can do with those results. Overall, model prediction has a degree of uncertainty especially with early time data and with many unknowns. I will also speak about the current outbreak in Los Angeles and how the LA County Hospital Demand Modeling Team is addressing that.
Session 5: Time-varying and Heterogeneity Properties of COVID Models
4:15 - 5:00
Chair: Jason Xu
4:15 - 4:30
We are interested in using observed disease incidence data to probe associations between rate of epidemic spread and time-varying covariates (e.g. local mobility patterns and air quality). Such information may serve as basic research or a reference in public health policy evaluation. We propose a novel Bayesian approach that leverages the mechanistic assumptions of the SIR model, and demonstrate its use on simulated data as well as data from the COVID-19 epidemic in California. In addition, we compare performance with relevant alternative methods and discuss certain statistical paradoxes to which naive approaches may fall prey.
4:30 - 4:45
Rui Liu, The University of North Carolina at Chapel Hill
Suitability of Time-varying SIR Models
Useful insight into the evolution of pandemics comes from SIR models. These reveal a key time-varying parameter R0, which reflects the number of people each infected person will infect in turn. Such models are trained on infected, recovered, and dead data, and we investigate each state in the USA. We find that in most states the recovered data are very unreliable, which renders the SIR models unusable. In some states, recovered data were not updated every day, so the cumulative record did not change on consecutive days. More serious is that in many other states, the sum of cumulative recovered and dead data is much lower than the cumulative infected data, which seriously impacts time-varying R0 estimates. We study this by defining a new measure, estimated recovery time (ERT), which is key to assessing the reliability of the data. States whose data pass the assessment can be used in further analysis.
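The dependence of time-varying SIR estimates on the recovered data can be seen in a simple sketch. It assumes a discrete-time SIR model with daily steps and estimates the transmission rate, the removal rate, and their ratio from cumulative confirmed and cumulative removed (recovered plus dead) counts. The function name and the estimator are illustrative assumptions, not the method used in the talk, and the ERT measure is not reproduced here.

```python
import numpy as np


def time_varying_sir_rates(confirmed, removed, population):
    """Daily estimates of beta, gamma, and beta/gamma from cumulative counts.

    Assumes the discrete-time SIR updates
        new cases_t   = beta_t  * S_t * I_t / N
        new removed_t = gamma_t * I_t
    with I_t = confirmed_t - removed_t and S_t = N - confirmed_t.
    """
    c = np.asarray(confirmed, dtype=float)
    r = np.asarray(removed, dtype=float)
    active = c[:-1] - r[:-1]
    susceptible = population - c[:-1]
    new_cases = np.diff(c)
    new_removed = np.diff(r)
    # Days with no active cases or no reported removals give inf or nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        beta = new_cases * population / (susceptible * active)
        gamma = new_removed / active
        ratio = beta / gamma
    return beta, gamma, ratio
```

When the removed series is not updated on a given day, `new_removed` is zero, `gamma` is estimated as zero, and the ratio diverges. When cumulative removed counts lag far behind confirmed counts, `active` is inflated and both rates are biased downward, which is consistent with the reliability concerns raised in the abstract.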
4:45 - 5:00
We develop a stochastic epidemic model on dynamic networks, where individual covariates can lead to heterogeneous infection rates. The epidemic and dynamic network processes are defined as a co-dependent continuous-time Markov chain, such that disease transmission is constrained by the contact network structure, and network evolution is influenced by individual disease statuses. To accommodate partial epidemic observations commonly seen in real-world data, we introduce a likelihood-based inference method inspired by the stochastic EM algorithm to estimate model parameters. Experiments on both synthetic and real datasets demonstrate that our inference method can accurately and efficiently recover model parameters and provide valuable insight in the presence of missing infection and recovery times in epidemic data.
Session 6: Modelling Socio-Economic Impact of COVID-19
5:00 - 6:00
Chair: Dorcas Ofori-Boateng
5:00 - 5:15
Kavya Ravishankar, The University of Pennsylvania and George Mason University
Modeling, Analysis and Control of Student Loan Debt using Epidemiological Models
Student loan debt is a debilitating problem that threatens a large subset of the American population. As of February 2019, the total amount of debt in the U.S. due to student loans amounted to $1.56 trillion. This paper models the student debt situation mathematically through the lens of an infectious disease contagion model. The study describes a belief proliferation model. Specifically, the spread occurs through the unfounded external reassurance to students that the value of their college education will amount to a future job that will enable them to pay off their loans in full and on time. Built on the classical SEIR compartmental model of epidemiology, this study analyzes the movement of individuals in the study set from the susceptible stage to the recovered stage using interconnected differential equations. We additionally consider an enhanced model to study the potential effect of an educational awareness program and the financial strain of the COVID-19 pandemic through respective optimal control variables. Utilizing Pontryagin's maximum principle, the augmented model determines the ideal control value to mitigate the rate of students refinancing their loans when unable to meet the required payments.
5:15 - 5:30
Asim Dey, Princeton University and The University of Texas at Dallas
Impacts of COVID-19 local spread and Google search trend on the US stock market
"We develop a novel temporal complex network approach to quantify the US county level spread dynamics of COVID-19. The objective is to study the effects of the local spread dynamics, COVID-19 cases and death, and Google search activities on the US stock market. We use both conventional econometric and Machine Learning (ML) models. The results suggest that COVID-19 cases and deaths, its local spread, and Google searches have impacts on abnormal stock prices between January 2020 to May 2020. In addition, incorporating information about local spread significantly improves the performance of forecasting models of the abnormal stock prices at longer forecasting horizons. On the other hand, although a few COVID-19 related variables, e.g., US total deaths and US new cases exhibit causal relationships on price volatility, COVID-19 cases and deaths, local spread of COVID-19, and Google search activities do not have impacts on price volatility."
5:30 - 5:45
Scott Blender, Temple University
Using Human-Centered Mobility to Predict Local Economic Recovery
On March 11th, 2020, the World Health Organization declared the novel coronavirus disease 2019 (COVID-19) a global pandemic. As of January 5th, there have been 20,558,489 cases in the United States. Due to initial spikes in cases in mid-March, the United States imposed stay-at-home restrictions to limit the spread of the coronavirus. As a result of these restrictions, numerous industries, supply chains, and producers were dramatically impacted, leading to a pandemic recession. Recently, researchers have been leveraging mobility datasets from companies such as Google, Apple, and SafeGraph to interpret and understand the effects of stay-at-home orders and how restrictions are evolving as the pandemic progresses. Traditional economic models and forecasts have multiple limitations due to the shock the pandemic initially presented and the need for higher-frequency data to observe evolving trends. For this specific study, I plan to investigate how economic recovery can be measured by utilizing mobility datasets from Google and SafeGraph to see how “mobile” an area is, and in turn, correlate mobility to economic recovery. Using mobility will also allow me to investigate correlations between mobility and the reduction in active labor force, air quality index, household income, and real estate data provided by realtor.com to model real-time economic recovery and provide further insights into the economic implications of the COVID-19 pandemic.
5:45 - 6:00
SARS-CoV-2, the virus that causes COVID-19, was first confirmed in the United States on January 19th, 2020, when a man in a Snohomish county urgent care presented with a cough and fever. One year later, COVID-19 has spread to every state in the union, with more than 21 million confirmed cases and over 367,000 related deaths. RAND has developed an epidemiological model that describes how Nonpharmaceutical Interventions (NPIs) can delay the spread of the virus, and a general equilibrium model to estimate the economic effects of these interventions. This presentation builds on this work and further relaxes structural and parametrical assumptions in this model, employing the Robust Decision Making (RDM) approach to investigate the robustness of reopening policies given the uncertainties surrounding the rollout of vaccines. We analyze the tradeoffs among health and economic outcomes implied by a range of reopening and vaccination policies, as well as suggest the circumstances under which non-dominated strategies might fail. Our findings highlight the need for a cautious reopening plan that accounts for uncertainties and the need to achieve both health and economic societal goals.