Using Google Trends data as an instrument for policy implementation

The aim of our analysis is to study the impact of lockdown policies and other movement restriction policies (such as workplace and school closures) on the spread of COVID-19. However, to draw causal inferences, we need to tackle the endogeneity issue that is likely to arise.

Endogeneity, a common issue in econometrics, refers to situations in which an explanatory variable is correlated with the error term.

In our analysis, endogeneity could arise from the following sources:

  • Omitted variables: one or several variables may affect both the log of cumulative deaths/cases and the implementation of containment policies. If the epidemic spreads quickly and strongly, the cumulative number of deaths and cases rises quickly, which encourages governments to adopt strong containment policies promptly. The dynamic of the epidemic is therefore strongly correlated with both the dependent variable (the cumulative number of deaths/cases) and the independent variable (the implementation of containment policies).

  • Simultaneity: the cumulative number of deaths/cases and the implementation of containment policies may be co-determined, each affecting the other. This is plausible: a containment policy is likely to affect the spread of the virus, and the dynamic of the virus is in turn likely to push governments to adopt containment measures quickly.
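To make the definition concrete, the problem can be written as a single outcome equation. The notation below is an assumption introduced for illustration only and does not come from the original analysis: $y_{it}$ stands for the log of cumulative deaths/cases in country $i$ at date $t$, $P_{it}$ for the containment policy variable, $X_{it}$ for the other covariates, and $\varepsilon_{it}$ for the error term.

```latex
% Illustrative notation (assumed, not taken from the original analysis)
\begin{align}
  y_{it} &= \beta\, P_{it} + X_{it}'\gamma + \varepsilon_{it}, \\
  \operatorname{Cov}(P_{it}, \varepsilon_{it}) &\neq 0 .
\end{align}
```

Under either source of endogeneity, the second line holds: the unobserved dynamic of the epidemic sits in $\varepsilon_{it}$ and also drives $P_{it}$, so a simple regression of $y_{it}$ on $P_{it}$ would not recover $\beta$.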

To correct the bias that could arise from this endogeneity, we will rely on an instrumental variable approach. An instrument should satisfy the following two requirements:

  • The instrument should be correlated with the endogenous explanatory variable, conditional on the other covariates. In our analysis, this means that our instrument should be correlated with the variable measuring the implementation of containment measures.

  • The instrument must not be correlated with the error term in the outcome equation, conditional on the other covariates. This means that the instrument should not suffer from the same problems as the original explanatory variable. This is the so-called exclusion restriction.
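The two requirements can be illustrated with a small simulation. The sketch below is not the estimation used in this analysis: it uses made-up data with a single regressor, no covariates, and hypothetical names (`z` for the instrument, `u` for an unobserved epidemic dynamic, `x` for the policy variable, `y` for the outcome). It only shows why a simple regression is biased when `u` drives both `x` and `y`, and how an instrument that is correlated with `x` but not with the error term recovers the assumed effect `beta`.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)

random.seed(0)
n = 20000
beta = -1.0  # assumed "true" effect of the policy on the outcome (illustrative)

z = [random.gauss(0, 1) for _ in range(n)]  # instrument: independent of u
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved epidemic dynamic
# The policy responds to the instrument (relevance) and to the epidemic (endogeneity).
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
# The outcome depends on the policy and on the same unobserved dynamic.
y = [beta * xi + 2.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

ols = cov(x, y) / cov(x, x)  # simple regression slope, biased because x correlates with u
iv = cov(z, y) / cov(z, x)   # instrumental variable estimate with one instrument

print(f"true effect: {beta}, OLS: {ols:.3f}, IV: {iv:.3f}")
```

In this setup the simple regression slope is pulled towards zero (in large samples it converges to `beta + 2/3`), which would understate the effect of the policy, while the instrumental variable estimate stays close to `beta`.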

In our analysis, we will use as an instrument the Google Trends index for the search term "Coronavirus". The Google Trends index provides a measure of the volume of Google queries by geographic location over a chosen time period.

For each country, our instrument will be the Google Trends index of all the other countries. By doing so, we aim to satisfy the exclusion restriction: search activity abroad should not be directly driven by the country's own epidemic.
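A minimal sketch of how such a leave-one-out instrument could be built is given below. The text does not say how the indices of the other countries are aggregated, so the simple unweighted mean used here is an assumption, as are the function name `leave_one_out_instrument`, the data layout (one list of daily values per country, aligned on the same dates) and the made-up numbers.

```python
from statistics import mean

def leave_one_out_instrument(trends):
    """For each country, average the index of all OTHER countries, date by date.

    trends: dict mapping country -> list of index values, one per date,
            with all lists aligned on the same dates (assumed layout).
    """
    countries = list(trends)
    n_dates = len(next(iter(trends.values())))
    instrument = {}
    for country in countries:
        instrument[country] = [
            mean(trends[other][t] for other in countries if other != country)
            for t in range(n_dates)
        ]
    return instrument

# Made-up index values for three hypothetical countries over three dates.
trends = {
    "A": [10, 20, 30],
    "B": [20, 40, 60],
    "C": [30, 60, 90],
}
instrument = leave_one_out_instrument(trends)
print(instrument["A"])  # mean of B and C at each date
```

The key design point is that a country's own search activity never enters its own instrument. A weighted average (for example by population or by distance) would be an equally plausible variant.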

Average Google Trends Index per continent

(for each country this index is calculated relative to China)

As we can observe in the previous graph, two peaks emerge when looking at the Google Trends Index:

  • The first one occurs at the end of January and is stronger in Asia and Oceania than on the other continents. This period corresponds to the growing awareness of the epidemic in China. It seems logical that neighbouring countries were more concerned about it. Moreover, most of these countries were also affected by the SARS virus in 2002-2003.

  • The second one, which is much stronger, occurs at the beginning of March. This second peak corresponds to the implementation of strict containment policies in Europe and in the rest of the world.

After this second peak, the Google Trends index for the search term "Coronavirus" appears to decrease on all continents, with some smaller peaks corresponding to lockdown extensions.

The results of the first-stage estimation (assessing the relevance of our instrument) will be part of our estimation strategy.
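As an indication of what a first-stage check involves, the sketch below regresses a policy variable on a single instrument and computes the first-stage F-statistic. It is a simplified illustration under stated assumptions, not the specification used here: one instrument, no covariates, homoskedastic errors, simulated data, and a hypothetical function name `first_stage_f`. A commonly used rule of thumb treats an F-statistic below about 10 as a sign of a weak instrument.

```python
import random

def first_stage_f(z, x):
    """F-statistic of the instrument in a regression of x on a constant and z.

    With a single instrument this equals the squared t-statistic of the slope
    (homoskedastic standard errors are assumed).
    """
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    ssr = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    sigma2 = ssr / (n - 2)            # residual variance
    se_slope = (sigma2 / s_zz) ** 0.5
    return (slope / se_slope) ** 2

# Simulated example: the policy variable x responds to the instrument z.
random.seed(1)
z = [random.gauss(0, 1) for _ in range(500)]
x = [0.5 * zi + random.gauss(0, 1) for zi in z]

f_stat = first_stage_f(z, x)
print(f"first-stage F-statistic: {f_stat:.1f}")
```

In an applied setting, the first stage would also include the other covariates and, with panel data, standard errors that allow for correlation within countries.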

More to come ...
