The Curious Case of Covid-19 in Kerala
First published: April 27, 2020. Updated May 2, 2020.
Dr. Jayakrishnan Unnikrishnan and Dr. Sujith Mangalathu
[pdf]
Unnikrishnan, J., Mangalathu, S., Raman kutty, V., "Estimating under-reporting of Covid-19 cases in Indian states: an approach using a delay-adjusted case fatality ratio", BMJ Open, (In press, Preprint-copy).
Amid the uncertainty and unease that Covid-19 has created globally, it would be remiss to overlook how the small Indian state of Kerala has been dealing with the pandemic. The efficiency the state has displayed makes it a curious case worth examining from a scientific perspective. The measures deployed so far nonetheless raise a pertinent question: what fraction of the true infections within the state has been recorded? A data-driven analysis reveals the extent of testing carried out in Kerala and quantifies the level of under-reporting of cases that may currently exist within the state.
Kerala is on par with many developed nations in social development indicators such as life expectancy, infant mortality and literacy. However, when the first positive case of Covid-19 in India was reported in Kerala on 30 January, in a student returning from Wuhan, China, even developed countries such as Italy and China were struggling to contain the spread of the pandemic. Moreover, given the significant non-resident Indian (NRI) population among Keralites, and the popularity of the state as a tourist destination in the winter months, the state was arguably in a particularly vulnerable position. In spite of these challenges, Kerala seems to have survived the onslaught of this merciless adversary, at least for now. Kerala has so far recorded only 3 deaths, which is under 0.4% of the national death toll and about a seventh of the national death toll per capita. The low death rate and the successful recovery of Covid-19 patients aged over 80 stand out nationally as well as internationally. We present a closer analysis of a few key metrics on the response to the pandemic within the state, and apply a model to the data to estimate the current state of infection.
One of the key factors that helped the state take control of the outbreak was the extensive testing of foreign arrivals and contact tracing of suspected cases carried out in the early days of the outbreak. This is reflected in key testing rate metrics. The number of tests the state has carried out per positive case detected (Figures 1 and 2) is more than twice the national average of India and well beyond the rates of developed countries such as the United States, United Kingdom and Switzerland. Kerala has also tested considerably more people per capita than the national average.
Figure 1: Comparison of Kerala with countries around the globe as on May 1, 2020. Source: ourworldindata.org, dhs.kerala.gov.in
Figure 2: Comparison of Kerala with countries around the globe as on May 1, 2020. Source: ourworldindata.org, dhs.kerala.gov.in
The extensive testing phase followed by the lockdown of key districts and the statewide lockdown helped curtail the spread significantly. The total count of reported active cases peaked at 266 within the state on April 6th and has now decreased to 123. As the state now contemplates opening up its businesses and districts, it is essential to ascertain the true spread of the disease. Recent studies based on antibody tests in the United States (US) have found that the fatality rate of Covid-19 is lower and that the disease spreads faster than thought earlier. It is therefore very much possible that the true infection rate in Kerala is not accurately captured by the currently reported active cases.
Figure 3: Covid 19 case evolution in Kerala
A pertinent question of interest for Kerala is what fraction of the true infections within the state have currently been recorded. Is it likely that a significant number of cases have gone unnoticed in spite of the strong testing protocols? Some recent works provide a path towards estimating this quantity, and we extend that study to Kerala. In a nutshell, the method works as follows. First, obtain an estimate of the true death rate or case fatality rate of Covid-19. Next, use this estimate along with the recorded number of deaths due to Covid-19 to estimate the likely number of infections in the state. We describe the procedure below.
The potency of a viral disease is typically quantified with a metric called the case fatality rate (CFR), defined as the fraction of infected people who die of the disease. For Covid-19, CFR estimates from other studies vary widely. A recently published study on a large number of patients gives a best estimate of 1.4% [1]. However, this is likely an overestimate, as testing was performed primarily on symptomatic people. A different published study based on data from China puts the estimate at 0.66% [6]. More recent antibody-based tests in the US have revealed that the infection is far more widespread than assumed earlier, leading to lower CFR estimates, as low as 0.5% [2].
The ratio of deaths to cases measured while the outbreak is in progress is referred to as the naive case fatality rate (nCFR). The nCFR is typically an underestimate of the true CFR, as the deaths recorded so far may not capture all the deaths that will eventually occur among the current cases. In Kerala, as of April 26, there have been a total of 468 positive cases and 3 deaths recorded, which gives an nCFR estimate of 0.64%. An improved estimate of the CFR can be obtained by accounting for the additional deaths that are likely to occur among the currently active cases. This estimate is referred to as the corrected case fatality rate (cCFR). A recent work [3] argued that this correction can be made using the distribution of the delay between the time an eventually fatal case is recorded and the time of death. The authors use a lognormal distribution for the delay, with a mean of 13 days and a standard deviation of 12.7 days as reported in prior work [4]. Applying the same methodology to the data from Kerala, we arrive at a cCFR estimate of 0.84%.
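The delay adjustment described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' notebook code: the function names are our own, the adjustment is simplified to weighting each day's new cases by the lognormal probability that a fatal outcome would already have occurred, and the daily case series is invented purely to exercise the function. Only the 3 deaths, 468 cases, 13-day mean and 12.7-day standard deviation come from the text.

```python
import math


def lognormal_params(mean, sd):
    """Convert a lognormal's mean and SD into (mu, sigma) of the underlying normal."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma_sq / 2.0
    return mu, math.sqrt(sigma_sq)


def lognormal_cdf(x, mu, sigma):
    """P(delay <= x) for a lognormal delay; zero for non-positive x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def naive_cfr(deaths, cases):
    """Deaths so far divided by cases so far."""
    return deaths / cases


def corrected_cfr(daily_cases, deaths, mean_delay=13.0, sd_delay=12.7):
    """Simplified delay-adjusted CFR (an assumption-laden sketch).

    Each day's new cases are weighted by the probability that a fatal
    outcome would already have been observed, so the denominator counts
    only cases whose outcome is expected to be known.
    """
    mu, sigma = lognormal_params(mean_delay, sd_delay)
    today = len(daily_cases)  # cases from day index d have been open today - d days
    known = sum(
        count * lognormal_cdf(today - day, mu, sigma)
        for day, count in enumerate(daily_cases)
    )
    if known <= 0:
        raise ValueError("no cases with expected known outcomes yet")
    return deaths / known


# Figures quoted in the text: 3 deaths among 468 recorded cases.
print(f"nCFR = {naive_cfr(3, 468):.2%}")  # prints nCFR = 0.64%

# Hypothetical daily counts, invented for illustration only (not Kerala data).
toy_daily_cases = [5, 10, 20, 30, 20, 10, 5]
print(f"toy cCFR = {corrected_cfr(toy_daily_cases, deaths=1):.2%}")
```

Because every weight is below one, the corrected estimate is always at least as large as the naive one, which matches the direction of the adjustment reported for Kerala.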
The ratio of the true CFR of Covid-19 to the cCFR observed in Kerala provides an estimate of the fraction of cases in the state that have been reported. As discussed earlier, estimates for the true CFR of Covid-19 have varied from 0.5% to 1.4% in different published studies. Using a value of 0.5% for the true CFR gives an estimate of 67.7% for the percentage of cases reported, and using a value of 1.4% gives an estimate of 100% (the ratio exceeds one, so the estimate is capped at 100%). This means that the actual number of people in Kerala who have been infected could be as high as 1.5 times the number of positive cases reported so far. In other words, in addition to the reported 500 positive cases, there may be up to 239 people in the state who have had the infection without having had a positive test*. This could include people who have had the infection and have now recovered without having been tested, as well as those who may still be showing symptoms but have not been tested. Similarly, if we further assume that the current under-reporting rate is equal to the average under-reporting rate, we can estimate that the true number of active cases currently in the state is likely to be close to 142, which is 1.5 times the currently reported number of active cases of 96.
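The arithmetic in this paragraph can be reproduced with a short sketch. The function names are our own, introduced for illustration, and the 67.7% reporting fraction is taken directly from the text rather than recomputed.

```python
def fraction_reported(true_cfr, observed_ccfr):
    """Share of infections that were recorded, capped at 100%."""
    return min(1.0, true_cfr / observed_ccfr)


def estimated_total(reported_count, reported_fraction):
    """Scale a reported count up to the implied true count."""
    return reported_count / reported_fraction


# Upper end of the CFR range (1.4%) against the 0.84% cCFR: capped at 100%.
print(fraction_reported(0.014, 0.0084))  # prints 1.0

# Lower end: the text reports 67.7% of cases recorded for a true CFR of 0.5%.
reported_share = 0.677
unreported = estimated_total(500, reported_share) - 500
true_active = estimated_total(96, reported_share)
print(round(unreported), round(true_active))  # prints 239 142
```

The same scaling factor, 1 / 0.677 or roughly 1.5, is applied to both cumulative and active cases, which is exactly the assumption that current under-reporting equals average under-reporting.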
The above calculations give an estimate of the actual level of infection in the state based on the available data. The accuracy of this analysis depends greatly on the quality of that data. For instance, if deaths are under-counted, that would directly affect the predicted under-reporting rate. Accuracy also depends significantly on the number of reported deaths and reported cases, and will be higher when these numbers are higher. With these caveats in mind, we now interpret the estimates. That the true rate of infection is higher than the reported rate is expected, since not every single case has been tested. However, it is reassuring that the model predicts the true number is off by a factor of less than 2, even with the lowest known estimate of 0.5% for the true CFR. As a comparison, the model applied to data from the US predicts that the true number of cases in the US may be up to 10 times the currently reported number [3]**, even for an assumed CFR of 1.4%. Nevertheless, the fact that the model predicts that only about 70% of cases have been reported calls for caution in decisions about testing and lockdown reversal. This under-reporting could very well explain the new cases that have appeared in different districts within the state without any obvious source. Opening up the state too early can lead to more acute outbreaks in subsequent waves, as was the experience on the island of Hokkaido in Japan.
The under-reporting could stem from one of two causes. One, not all people with infections have been tested, and two, some tests on infected people falsely returned a negative result. The way to address the first issue is to broaden the testing protocol. The requirements for testing contacts of known cases, and of those arriving from regions with severe outbreaks, may be strengthened further. This would be particularly important as more people are expected to return to the state from foreign countries. Furthermore, once the monsoon season starts in June, and more viral diseases with Covid-19-like symptoms start spreading within the population, it will be even more challenging to distinguish true Covid-19 cases from cases of other viral infections. In this scenario, it will be important to keep up rigorous testing protocols to ensure early detection of possible Covid-19 clusters. A possible strategy would be to incorporate random testing of people with symptoms, regardless of whether or not they are suspected of having been exposed to Covid-19. For the second issue, the solution is to use testing methods that are more accurate. Multiple countries have reported false negatives from commonly used tests. Antigen tests are known to fail on patients who are tested at a time when they are not showing symptoms, and RT-PCR tests are known to be effective only in the first week of the disease, when the virus is present in the throat [7]. More recent antibody tests seek to identify individuals who have had the disease at any time in the past by looking for antibodies. However, the accuracy of such tests is still not well known, and these tests are prone to false positives in addition to false negatives. Recently, some countries have proposed using multiple testing modalities in combination to address the issue of test accuracy [8]. This is one strategy that could be adopted, in addition to using testing methodologies that have been verified and approved for use in multiple countries.
The experiences of the past few months from around the world have shown us how potent a foe we are dealing with in Covid-19. Kerala has fared well so far in coping with the first waves of the infection. As it prepares to open up after the lockdown, it would be prudent to remain steadfast in testing and tracing, while stressing the need to minimize non-essential travel and maintain social distancing to the extent possible. Broader testing policies for arrivals from regions with severe outbreaks, suspected cases and their contacts could be enforced. Emphasis should be placed on using testing protocols and methodologies with verified high accuracy. Moreover, in places such as railway stations and markets, where people are likely to congregate in large numbers, it would be important to have temperature monitoring systems and to admit only people without high temperatures, as was the policy followed in China. Persons recording high temperatures may also be randomly selected for Covid-19 testing. Besides these measures for improving testing, policies to minimize spread, such as providing access to hand sanitizers in public spaces and limiting the allowed density of persons per square foot in shops and other public spaces, could also be adopted.
Code availability:
The Jupyter Notebook Python code for the statistical model is available at https://colab.research.google.com/drive/1l3axGGRVtjAUFPahFnxsrHFtw69QeW3R#scrollTo=B-r0LZXPNyMg
About the authors:
Dr. Jayakrishnan Unnikrishnan, originally from Kerala, is an autonomous driving researcher with expertise in statistical signal processing and machine learning. He holds a PhD degree in Electrical and Computer Engineering from University of Illinois at Urbana-Champaign and an undergraduate degree in Electrical Engineering from IIT Madras.
Dr. Sujith Mangalathu is a data scientist and machine learning expert. He completed his Ph.D. at the Georgia Institute of Technology, United States. His alma maters include the University of Pavia, Italy, the Indian Institute of Technology Madras, and Kerala University.
Assumptions made in the study:
Deaths are correctly reported (Note: if the reported deaths are lower than true deaths, the model would predict an even higher level of under-reporting)
The true case fatality rate of Covid-19 is > 0.5%
The time from case report to death follows the lognormal distribution used in the study