The analysis shown here has been performed by Claudia Pasquero on data released by the Italian Civil Protection Agency every day in the late afternoon. The data can be retrieved at https://github.com/pcm-dpc/COVID-19. Input from Jost von Hardenberg on nonlinear data fitting is gratefully acknowledged.
WARNING!!!! May 19th. We see the effects of the relaxation of the confinement measures, and they are clearly not good! In Piedmont, the estimated number of newly hospitalised people (obtained as the sum of the change in the number of hospitalised people and the number of deceased people) has become positive again in the last few days. The same tendency is seen in the data for Italy as a whole.
I have made a comparison between the official death count from COVID-19 and the anomalous death count (with respect to the 2015-2019 mean) from ISTAT. Note that the latter has been computed for a subset of the Italian municipalities, accounting for more than 1/3 of the total population. These are the municipalities that had an increase of at least 20% in the 2020 death rate with respect to the mean of the previous years. The actual number of deceased people could be much larger than what has officially been counted.
March 23rd: I am afraid the metric that I have used so far to define the spread of the virus (the total number of people who have been hospitalised, including the deceased and the recovered) is no longer appropriate: too many hospitals in the North of Italy have reached capacity. The data have become very noisy. On March 20th, for instance, some regions, including Piedmont, decided to discharge several hundred sick people who had previously been hospitalised (I guess because they wanted to have beds available for incoming patients in more serious condition). Unfortunately, we don't know how many were discharged from the hospitals. Similarly, the large reduction in the daily number of hospitalised people on March 23rd comes from the Lombardia region, where the hospitals are at capacity and cannot admit any more people. This is the daily increase of hospitalised people. It has become very noisy over the last week, since serious problems related to hospital capacity started to affect hospitalisations. The large swings in the curve are associated with regional policy changes that caused the discharge of people who would otherwise have stayed in the hospital.
The different panels in the figure below refer to time series of 1) all hospitalised people, including those who have already left the hospital, either because they recovered or because they died (top left panel); 2) all people who tested positive (top right panel); 3) all people currently in intensive care units (bottom left panel); 4) all deceased people (bottom right panel). Please keep in mind that the number of positives is not necessarily a measure of the spread of the disease, as it is highly dependent on the number of tests performed. It is believed that the number of known cases severely underestimates the number of people actually infected. The fraction of unknown cases could be changing over time, as different protocols for performing tests have been used. On the other hand, the number of hospitalised people might also be biased, as doctors might have become more cautious in hospitalising, considering that the number of available places in intensive care units is decreasing.
Actual data are shown as stars, while the lines correspond to exponential functions, obtained by fitting the data from Feb 24th to today (red lines) and by fitting the data from Feb 24th to Mar 6th (yellow lines). The red lines fall below the yellow ones, showing that the spread has slowed since the beginning of the outbreak.
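As an illustration of the kind of fit described above, here is a minimal sketch in Python. It is not the code used for the figures: it fits a straight line to the logarithm of the counts by ordinary least squares, which is a simpler stand-in for a nonlinear fit, and both the function name fit_exponential and the numbers are made up for the example.

```python
import math

def fit_exponential(days, counts):
    """Fit counts ~ n0 * exp(rate * day) by least squares on log(counts).

    Assumption: a log-linear fit is used here for simplicity; it is not
    necessarily the fitting method behind the figures in this page.
    """
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    rate = sxy / sxx                         # growth rate per day
    n0 = math.exp(mean_y - rate * mean_t)    # fitted value at day 0
    return n0, rate

# Synthetic, noise-free data (NOT real counts): 100 cases growing at 0.3/day
days = list(range(10))
counts = [100 * math.exp(0.3 * t) for t in days]
n0, rate = fit_exponential(days, counts)
```

Fitting once on the first days only and once on the whole series, as done for the yellow and red lines, amounts to calling the same function on two different slices of the data.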
Exponential growth is expected from the simple consideration that each infected person infects a fixed number of people, P. If each infected person instead infects Q individuals, with Q<P, the disease spreads at a slower pace, and the growth curve falls below the previously computed exponential line. Confinement (limiting people's social contacts) is the way that Italy and many other countries have chosen to limit the exponential growth.
In the figure below, the same data as above are shown in a semilog plot, where straight lines indicate exponential growth, and the fits are extrapolated over the next 30 days, leading to one million deaths by April 22nd. Clearly, this situation MUST change.
Finally, we show in the figure below the doubling time obtained by fitting an exponential function to the data from February 24th to the date indicated on the abscissa. A longer doubling time indicates that the disease is spreading more slowly. The increase in doubling time shown for people in the hospital since the beginning of March is a good sign, as it indicates a slowdown of the virus spread. Please note that the values in the first few days of the plot are not reliable, as they are obtained by fitting an exponential curve to a few data points that are highly variable because of their small values (the dataset is not large enough to obtain reliable estimates of the doubling time).
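The doubling time follows from the fitted growth rate r as log(2)/r. A minimal sketch of the procedure described above, refitting from the first day up to each successive end date, could look like this. The helper names, the log-linear fit and the synthetic series are assumptions made for the example, not the code behind the figure.

```python
import math

def growth_rate(counts):
    """Slope of a least-squares line through log(counts) vs. day index."""
    n = len(counts)
    logs = [math.log(c) for c in counts]
    mean_t = (n - 1) / 2
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(logs))
    return sxy / sxx

def doubling_times(counts, min_points=3):
    """Doubling time from a fit of days 0..end, keyed by the end day index."""
    result = {}
    for k in range(min_points, len(counts) + 1):
        result[k - 1] = math.log(2) / growth_rate(counts[:k])
    return result

# Synthetic series (NOT real data): doubling every 2 days for the first
# 6 days, then every 6 days.
counts = [100 * 2 ** (t / 2) for t in range(6)]
counts += [counts[-1] * 2 ** (t / 6) for t in range(1, 10)]
dt = doubling_times(counts)
```

Because every fit starts from the first day, the estimate reacts slowly to a change in the growth rate: in the synthetic series the doubling time rises only gradually above 2 days after the slowdown.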
All the plots indicate that the total number of people who are or have been in the hospital has been growing significantly, but with a doubling time that has been increasing (from about 2 days to over 6 days) since the beginning of March, suggesting a reduction in the number of people infected by each sick person. This suggests that the confinement measures taken by our administrators might be effective. We need more data in the coming days to see whether this behaviour continues. It is also possible that the increase in the number of hospitalised people has been affected by a more cautious hospitalisation procedure, considering the limited number of places available in intensive care units. Please note that the intensive care unit data account for the people who are in intensive care on each day, while the people who have been in intensive care and are no longer there are not included, so this is not really a cumulative function (whereas "hospitalized+deceased+recovered" refers to all people who are in the hospital today, plus those who left the hospital because they recovered or died). The increase in the doubling time for the number of deceased people seems to lag behind the other curves. This is expected, because the number of deceased people mainly depends on how many people were in severe condition in the previous days (on average, death occurs about two weeks after infection), rather than on the number of infected people on the corresponding day, leading to a time lag between the actual reduction of infected people and the reduction of deceased people. In any case, the doubling time for deceased people has been increasing since March 10th. We stress that this is a very positive result, as the number of deceased people is a reliable measure of the evolution of the outbreak, and it does not suffer from the possible biases associated with the number of people who tested positive, nor with the hospitalisations.
(However, it is possible that some people die at home without being tested, especially in these really hard times with hospitals having reached capacity).
Another indicator of the spread, which can be defined from the variation between any day and the previous day only, is the relative change in the number of people who have been hospitalised (i.e. the increase from the previous day divided by the previous day's value). When this relative change decreases, the spread of the disease is getting slower. Sometimes, mathematicians also use log(N_{i+1}/N_i) as another metric for the same quantity, where the ratio is between the numbers of hospitalised people on two consecutive days. Those plots are shown in the figure below, and a noisy reduction is visible up to March 13th (note that the data point for March 7th is flawed because, according to a communication from the Civil Protection Agency, some data from Brescia were missing). The decrease was probably due to social distancing in some regions of Italy, imposed to various degrees starting on Feb 25th. Over the week of March 10th-18th, however, the curve levelled off. It then resumed its decrease, when the nationwide social distancing imposed on March 11th probably started to affect the spread of the disease. However, the data have become unreliable because several hospitals reached capacity.
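The two metrics described above can be computed from a cumulative series in a few lines. This is only a sketch: the function names are hypothetical and the numbers are made up, not taken from the Civil Protection data.

```python
import math

def relative_change(cumulative):
    """(N_{i+1} - N_i) / N_i for each pair of consecutive days."""
    return [(b - a) / a for a, b in zip(cumulative, cumulative[1:])]

def log_ratio(cumulative):
    """log(N_{i+1} / N_i) for each pair of consecutive days."""
    return [math.log(b / a) for a, b in zip(cumulative, cumulative[1:])]

# Made-up cumulative counts of hospitalised people on four consecutive days
hospitalised = [100, 130, 160, 184]
rel = relative_change(hospitalised)   # daily growth as a fraction
logs = log_ratio(hospitalised)        # same information on a log scale
```

Since log(1+x) is close to x when x is small, the two metrics nearly coincide once the daily growth drops to a few percent. For exactly exponential growth the log ratio is constant and equal to the growth rate per day.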
The simplest way of modelling the spread of the disease while taking into account limiting factors (such as the difficulty the virus has in reaching new people) is to use a logistic function. However, a logistic fit is valuable only when the growth curve gets close to its inflection point (i.e. the so-called "peak" of the spread). We have no indication that we are close to it.
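To see why a logistic fit says little before the inflection point, here is a small sketch with made-up parameters (the final size k, the rate r and the inflection day t0 are all assumptions chosen for illustration). Far from the inflection point the logistic curve is practically indistinguishable from a pure exponential, so early data carry almost no information about k or t0.

```python
import math

def logistic(t, k, r, t0):
    """Logistic curve with final size k, growth rate r, inflection at t0."""
    return k / (1 + math.exp(-r * (t - t0)))

# Hypothetical parameters, for illustration only
k, r, t0 = 10000.0, 0.2, 60.0

# Early on, the day-to-day ratio matches the exponential value exp(r)
early_ratio = logistic(1, k, r, t0) / logistic(0, k, r, t0)

# At the inflection point the curve has reached half of its final size
value_at_inflection = logistic(t0, k, r, t0)

# The daily increase is largest around the inflection point (the "peak")
increments = [logistic(t + 1, k, r, t0) - logistic(t, k, r, t0)
              for t in range(120)]
peak_day = increments.index(max(increments))
```

With these numbers the early ratio differs from exp(0.2) only in the sixth decimal place, which real, noisy data could never resolve.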
Some analysis by region is shown here, with the cumulative number of hospitalised people in black, the cumulative number of people who tested positive in blue, and the cumulative number of tests performed in light blue, according to the following region codes:
13 'Abruzzo'
17 'Basilicata'
18 'Calabria'
15 'Campania'
8 'Emilia Romagna'
6 'Friuli Venezia Giulia'
12 'Lazio'
7 'Liguria'
3 'Lombardia'
11 'Marche'
14 'Molise'
1 'Piemonte'
16 'Puglia'
20 'Sardegna'
19 'Sicilia'
9 'Toscana'
4 'Trentino Alto Adige'
10 'Umbria'
2 'Valle d'Aosta'
5 'Veneto'
Note that for most of March the number of people who tested positive grew at the same rate as the number of tests performed. This can be seen in the bottom graph, where the two lines become parallel after an initial phase in which the actual spread of the disease (measured as the ratio of positive tests to the total number of tests performed) grew. The top panel shows the daily ratio of new positive people to new tests performed.
This means that the number of positive people cannot be used as a proxy for estimating the spread of the disease, at least not in March, when the growth of the number of positive people merely reflects the growth of the number of tests performed. The situation improved in April, when the fraction of positive tests fell below 10%, the threshold indicated by the WHO for fully capturing the spread of the disease.
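The daily fraction of positive tests discussed above can be obtained from the two cumulative series by differencing. The sketch below uses a hypothetical function name and made-up numbers, not the real data.

```python
def daily_positive_fraction(cum_positives, cum_tests):
    """New positives divided by new tests, for each day after the first.

    Returns None for days on which no new tests were reported.
    """
    fractions = []
    for i in range(1, len(cum_tests)):
        new_tests = cum_tests[i] - cum_tests[i - 1]
        new_positives = cum_positives[i] - cum_positives[i - 1]
        fractions.append(new_positives / new_tests if new_tests > 0 else None)
    return fractions

# Made-up cumulative series: in the first days positives simply track tests
cum_tests = [1000, 2000, 4000, 8000]
cum_positives = [200, 400, 800, 1000]
fractions = daily_positive_fraction(cum_positives, cum_tests)

# Days on which the fraction is below the 10% threshold mentioned above
below_threshold = [f is not None and f < 0.10 for f in fractions]
```

In this example the fraction stays at 20% while the tests double each day, so the growth in positives only mirrors the growth in testing. It drops to 5% on the last day, when positives grow more slowly than tests.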