Epidemics are traditionally viewed through the lens of time. Most simply, we can consider an SIR (susceptible-infectious-removed) dynamical system. The model makes many simplifying assumptions, but it has been shown to work well for analyzing general epidemic trajectories. A visual of the trajectories can be seen on the left.
In this model, infection occurs at a rate proportional to both the susceptible and infectious population sizes, and recovery occurs at a rate proportional to the infectious population size. This can be thought of as "interaction is necessary for infection": without both infectious and susceptible people, no new infections will occur.
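As a rough illustration of these dynamics, the following Python sketch integrates the SIR equations with a simple forward Euler step. The function name, the population size, and the rate values (`beta`, `gamma`) are illustrative assumptions, not the parameters behind the figures.

```python
def simulate_sir(n, i0, beta, gamma, days, dt=0.01):
    """Integrate the SIR equations with a fixed-step forward Euler scheme.

    Returns a list of (t, S, I, R) tuples, one per step.
    All parameter values used below are hypothetical.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    steps = int(round(days / dt))
    path = [(0.0, s, i, r)]
    for k in range(1, steps + 1):
        # Infection: proportional to both the susceptible and infectious sizes.
        new_infections = beta * s * i / n * dt
        # Recovery: proportional to the infectious size only.
        new_removals = gamma * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        path.append((k * dt, s, i, r))
    return path


# Hypothetical example: R_0 = beta / gamma = 2 in a population of 10,000.
path = simulate_sir(n=10_000, i0=10, beta=0.5, gamma=0.25, days=200)
peak_t, _, peak_i, _ = max(path, key=lambda row: row[2])
print(f"infectious peak of about {peak_i:.0f} people near day {peak_t:.0f}")
```

Setting `i0=0` leaves the susceptible count unchanged, which is the "interaction is necessary for infection" statement in code form.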
Traditionally, epidemiologists track an epidemic over the course of time. This feels very natural to us as humans; however, we show that it leads to unnecessary uncertainty when making predictions and analyzing the spread. The bottom plot on the left is an "epi-curve." It shows the incidence (new cases) as a function of time. This should not be confused with the infectious curve (orange in the top figure).
The problem arises when trying to make predictions based on one curve. The various curves are taken from random simulations of the same disease on the same network. It can be difficult to distinguish between the yellow curve and the others. This is because we analyze the epidemic from the perspective of time.
We start by considering 6000 simulations with the same parameters as those above. We plot their epi-curves on the right. While they share a general shape, there is significant spread among the realizations. With only one random curve (or an actual epidemic unfolding), it can be difficult to estimate disease characteristics, such as infectivity, and when peak incidence will occur.
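The spread among realizations can be reproduced in miniature with a Gillespie-style stochastic simulation. The sketch below is an assumption-laden stand-in: it uses a well-mixed population rather than the network used for the figures, only 20 runs rather than 6000, and hypothetical parameter values and function names.

```python
import random


def gillespie_sir(n, i0, beta, gamma, rng):
    """One stochastic SIR realization; returns the list of infection times."""
    s, i, t = n - i0, i0, 0.0
    infection_times = []
    while i > 0:
        rate_inf = beta * s * i / n
        rate_rec = gamma * i
        total = rate_inf + rate_rec
        t += rng.expovariate(total)          # waiting time to the next event
        if rng.random() < rate_inf / total:  # the event is an infection
            s -= 1
            i += 1
            infection_times.append(t)
        else:                                # the event is a recovery
            i -= 1
    return infection_times


def daily_incidence(infection_times):
    """Bin infection times into whole days to obtain an epi-curve."""
    if not infection_times:
        return []
    counts = [0] * (int(infection_times[-1]) + 1)
    for t in infection_times:
        counts[int(t)] += 1
    return counts


rng = random.Random(0)
peak_days = []
for _ in range(20):
    curve = daily_incidence(gillespie_sir(2000, 5, 0.5, 0.25, rng))
    if sum(curve) > 200:                     # skip runs that die out early
        peak_days.append(curve.index(max(curve)))
print("day of peak incidence in each run:", sorted(peak_days))
```

Even with identical parameters, the day of peak incidence generally differs from run to run, which is the timing uncertainty described above.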
However, if we track the epidemic in terms of cumulative cases instead of time, all 6000 realizations align closely along an average curve. Not only does this bring the curves more in line with one another, but an equation for the black dotted line is also known.
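The text does not state the equation for the mean curve. For the deterministic SIR model, eliminating time gives incidence as a function of cumulative cases C: dC/dt = beta * (C + (N / R_0) * ln(1 - C/N)) * (1 - C/N), when the initial number of cases is negligible. The sketch below implements that relation; treating it as the plotted black dotted line is an assumption, and the function name and parameter values are hypothetical.

```python
import math


def icc_incidence(c, n, beta, r0, c0=0.0):
    """Deterministic SIR incidence dC/dt at cumulative case count c.

    Requires 0 <= c0 <= c < n. The bracketed term is the number of
    currently infectious people expressed through c alone; c0 is the
    cumulative case count at time zero (taken as 0 by default).
    """
    offset = math.log(1.0 - c0 / n)
    infectious = c + (n / r0) * (math.log(1.0 - c / n) - offset)
    return beta * infectious * (1.0 - c / n)


# Hypothetical parameters. Dividing both axes by N removes the population size.
for n in (1_000, 100_000):
    scaled = [icc_incidence(f * n, n, beta=0.5, r0=2.0) / n for f in (0.2, 0.4, 0.6)]
    print(n, [round(x, 4) for x in scaled])
```

Both population sizes print the same scaled values, because incidence per capita depends on C only through the fraction C/N. Multiplying `icc_incidence` by one day gives a rough expected number of new cases for the next day at the current cumulative count.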
This comes from looking at an epidemic as a random process in which each event is either an infection or a recovery. We then count the number of cases per day and can generate the plot to the right. The key finding is that each event (infection or recovery) is independent of all previous events. In other words, predicting the future does not require knowing what happened in the past. This is not the case for analysis from the time perspective.
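One way to make this concrete, assuming a well-mixed Markovian SIR model (an assumption; the simulations described here run on a network), is to note that the chance that the next event is an infection is beta*S*I/N divided by (beta*S*I/N + gamma*I). The infectious count cancels, leaving a quantity that depends only on the cumulative case count. The function names and parameter values in this sketch are hypothetical.

```python
import random


def infection_probability(c, n, r0):
    """P(next event is an infection) when c people have been infected so far.

    The value depends on c only: not on the clock time, and not on the
    order of earlier infections and recoveries.
    """
    s_frac = 1.0 - c / n
    return r0 * s_frac / (r0 * s_frac + 1.0)


def run_event_chain(n, i0, r0, rng):
    """Play an epidemic event by event, with no clock, until no one is infectious.

    Returns the final cumulative number of cases.
    """
    c, i = i0, i0
    while i > 0:
        if rng.random() < infection_probability(c, n, r0):
            c += 1   # infection: one more cumulative case, one more infectious
            i += 1
        else:
            i -= 1   # recovery: one fewer infectious person
    return c


rng = random.Random(1)
finals = [run_event_chain(n=2000, i0=5, r0=2.0, rng=rng) for _ in range(10)]
print("final cumulative cases per run:", finals)
```

Because each step is a weighted coin flip whose weight is known from the current case count, the chain can be analyzed with elementary probability rather than by integrating over time.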
Not only does this allow us to make more accurate predictions, but those predictions can be made without numerical simulations or intensive computation. Instead, if we know the number of cases so far, we can predict the number of new cases the next day with simple statistical calculations.
This also allows for easy comparison across different populations and locations. Below, we simulate the same disease on populations of different sizes. It is difficult to discern a general pattern from the epi-curves; however, when we look at cumulative cases, both scaled and unscaled, we can see the hidden universality of the spread.
We can apply this to the COVID-19 pandemic. For Arizona specifically, when the epidemic is analyzed from the perspective of time, it is difficult to see when public health policy takes effect, whether it has an impact, and what the general trajectory of the pandemic is. However, when we consider the cumulative cases perspective, it can be much easier to extract disease and spread properties.
For Arizona, we can see three distinct waves in 2020: the first from March to May, the second from May to August, and the third through the end of the year. However, it is difficult to distinguish when these start and stop within the epi-curve.
Not only is identifying waves easier, but we can also see that the R_0 value (basic reproductive number) does not vary greatly between the waves; instead, the number of people potentially exposed to the epidemic grows. At the beginning of the epidemic, only ~50,000 people were at immediate risk of contracting the virus. However, after May 15, when the stay-at-home order was lifted, this number grew to ~280,000. Finally, during the third wave, nearly the whole population of Arizona became part of the population interacting with the disease.
Finally, this method yields the distribution of the final epidemic size (a previously known result) and the distribution of incidence at the peak of the theoretical ICC (incidence versus cumulative cases) curve. This means that, based on known data, we can give confidence intervals for the final epidemic size and for how bad the epidemic will be at its worst. These calculations do not require hours of CPU time in simulations or more complex statistical methods such as MCMC; they can be done in seconds on a laptop.
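The distributional results themselves are not reproduced here. As a simpler point of reference, the deterministic SIR model has a classical final-size relation, z = 1 - exp(-R_0 * z), where z is the fraction of the at-risk population that is eventually infected. The sketch below solves it by bisection; it gives only a central value, not a confidence interval, and the function name and the R_0 value are hypothetical.

```python
import math


def final_size_fraction(r0, tol=1e-12):
    """Nonzero root z of z = 1 - exp(-r0 * z); returns 0.0 when r0 <= 1."""
    if r0 <= 1.0:
        return 0.0
    # f(z) = z - 1 + exp(-r0 * z) is negative at z = 1 - 1/r0 and
    # positive at z = 1, so the root is bracketed between the two.
    lo, hi = 1.0 - 1.0 / r0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - 1.0 + math.exp(-r0 * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


z = final_size_fraction(2.0)  # hypothetical R_0
print(f"about {100 * z:.1f}% of the at-risk population is eventually infected")
```

The computation takes a few dozen bisection steps, which illustrates why estimates of this kind take seconds rather than hours.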
Making accurate predictions during a pandemic can be an unforgiving task, especially when these predictions are used to inform public health. However, ICC statistical analysis offers a new perspective that, when used in conjunction with the many other highly accurate methods, can help make better predictions. No single method should be the be-all and end-all for public health predictions. But combining various perspectives, from simple (ICC) to complex, can help build a fuller understanding of epidemic outbreaks and of what can be done to reduce their spread.