[Google Scholar]

Publications in Journals

Wallis J., Azqueta-Gavaldon A., Ananthakumar T., Dürichen R., and Albergante L. 2022. Similarity-based prediction of ejection fraction in heart failure patients. Informatics in Medicine Unlocked, Volume 32:


Abstract: Biomedical research is increasingly employing real-world evidence (RWE) to foster discoveries of novel clinical phenotypes and to better characterize the long-term effects of medical treatments. However, due to limitations inherent in the collection process, RWE often lacks key features of patients, particularly when these features cannot be directly encoded using data standards such as ICD-10. Here, we propose a novel data-driven statistical machine learning approach, named Feature Imputation via Local Likelihood (FILL), designed to infer missing features by exploiting feature similarity between patients. We test our method using a particularly challenging problem: differentiating heart failure patients with reduced versus preserved ejection fraction (HFrEF and HFpEF respectively). The complexity of the task stems from three aspects: the two share many common characteristics and treatments, only part of the relevant diagnoses may have been recorded, and the information on ejection fraction is often missing from RWE datasets. Despite these difficulties, our method is shown to be capable of inferring heart failure patients with HFpEF with a precision above 80% when considering multiple scenarios across two RWE datasets containing 11,950 and 10,051 heart failure patients. This is an improvement over classical approaches such as logistic regression and random forest, which were only able to achieve a precision below 73%. Finally, this approach allows us to analyse which features are commonly associated with HFpEF patients. For example, we found that specific diagnostic codes for atrial fibrillation and personal history of long-term use of anticoagulants are often key in identifying HFpEF patients.

Azqueta-Gavaldon A. 2019. Causal inference between cryptocurrency narratives and prices: Evidence from a complex dynamic ecosystem. Physica A: Statistical Mechanics and its Applications, Volume 537:


Abstract: In this note, I explore the causal relationship between narratives propagated by the media and crypto prices. Firstly, I unveil four cryptocurrency-related narratives: investment, technological innovation, security breaches and regulation. Secondly, after acknowledging their tone (sentiment), I apply Convergent Cross Mapping (CCM) to assess the causal relationship between narratives and prices. I find strong bi-directional causal relationships between prices and narratives concerning investment and regulation, while a uni-directional causal association exists in narratives relating technology and security to prices. Therefore, this work contributes to the recent economic literature that connects consumer behaviour to narratives.

Abstract: I propose creating a news-based Economic Policy Uncertainty (EPU) index by employing an unsupervised algorithm able to deduce the subject of each article without the need for pre-labeled data. This approach economizes on costly human classification to pre-define a set of keywords.

Data: The time series created by the unsupervised algorithm can be found here.

Other publications

Azqueta-Gavaldon A., Hirschbühl D., Onorante L., and Saiz L. (2020): Nowcasting business cycle turning points with stock networks and machine learning, Working Paper Series 2494, European Central Bank

Abstract: We propose a granular framework that makes use of advanced statistical methods to approximate developments in economy-wide expected corporate earnings. In particular, we evaluate the dynamic network structure of stock returns in the United States as a proxy for the transmission of shocks through the economy and identify node positions (firms) whose connectedness provides a signal for economic growth. The nowcasting exercise, with both the in-sample and the out-of-sample consistent feature selection, highlights which firms are contemporaneously exposed to aggregate downturns and provides a more complete narrative than is usually provided by more aggregate data. The two-state model for predicting periods of negative growth predicts future states remarkably well by using information derived from the node positions of manufacturing, transportation and financial (particularly insurance) firms. The three-state model, which identifies high, low and negative growth, successfully predicts economic regimes by making use of information from the financial, insurance, and retail sectors.

Azqueta-Gavaldon A. (2020): Political referenda and investment: evidence from Scotland, Working Paper Series 2403, European Central Bank

Abstract: We present evidence that referenda have a significant, detrimental effect on investment. Employing an unsupervised machine learning algorithm over the period 2008-2017, we construct three important uncertainty indices underlying reports in the Scottish news media: Scottish independence (IndyRef)-related uncertainty; Brexit-related uncertainty; and Scottish policy-related uncertainty. Examining the relationship of these indices with investment on a longitudinal panel of 3,589 Scottish firms, the evidence suggests that Brexit-related uncertainty is more strongly associated with investment than IndyRef-related uncertainty. Our preferred specification suggests that a one standard deviation increase in Brexit uncertainty foreshadows a reduction in investment by 8% on average in the following year. In addition, we find that the uncertainty associated with the Scottish referendum for independence, while negligible at the aggregate level, relates more strongly to the investment of listed firms as well as those operating on the border with England. We also present evidence of greater sensitivity to these indices among firms that are financially constrained or whose investment is to a greater degree irreversible.

Azqueta-Gavaldon A., Hirschbühl D., Onorante L., and Saiz L. (2020): Sources of economic policy uncertainty in the euro area: an unsupervised machine learning approach, Working Paper Series 2359, European Central Bank (R&R European Economic Review)

Abstract: We model economic policy uncertainty (EPU) in the four largest euro area countries by applying machine learning techniques to news articles. The unsupervised machine learning algorithm used makes it possible to retrieve the individual components of overall EPU endogenously for a wide range of languages. The uncertainty indices computed from January 2000 to May 2019 capture episodes of regulatory change, trade tensions and financial stress. In an evaluation exercise, we use a structural vector autoregression model to study the relationship between different sources of uncertainty and investment in machinery and equipment as a proxy for business investment. We document strong heterogeneity and asymmetries in the relationship between investment and uncertainty across and within countries. For example, while investment in France, Italy and Spain reacts strongly to political uncertainty shocks, in Germany investment is more sensitive to trade uncertainty shocks.

Azqueta-Gavaldon A., Hirschbühl D., Onorante L., and Saiz L. (2019): Sources of economic policy uncertainty in the euro area: a machine learning approach, Economic Bulletin Boxes, European Central Bank, vol. 5.

Azqueta-Gavaldon A. 2017: Financial Investment and economic policy uncertainty in the UK. IML '17 Proceedings of the 1st International Conference on Internet of Things and Machine Learning.

Abstract: UK-based financial firms reported net disinvestment of 15 billion pounds following Brexit. This was the fifth time financial disinvestment had occurred since the data series began in 1987. Parallel to this event, Economic Policy Uncertainty (EPU) in the UK experienced its biggest rise at the time of the Brexit referendum in June 2016. This note studies the relationship between EPU, its particular components, and financial investment. I find that overall EPU and specifically fiscal policy, monetary policy, geopolitical, regulation and liquidity uncertainty have the highest negative sensitivity to financial investment.

Work in progress

Developing a real estate yield investment device using granular data and machine learning (joint with Gonzalo Azqueta-Gavaldon, Monica Azqueta-Gavaldon, and Inigo Azqueta-Gavaldon) [arXiv]

Abstract: This project aims at creating an investment device to help investors determine which real estate units have a higher return on investment in Madrid. The idea is simple: determine the monthly rental price of a unit and how much the mortgage would cost. To do so, we gather data from a real estate web page with millions of real estate units across Spain, Italy and Portugal. In this note, we present the road map for how we gather the data; give descriptive statistics of the 8,121 real estate units used (rental and sale); build a return index based on the difference in prices of rental and sale units (per neighborhood and size); and introduce machine learning algorithms for rental real estate price prediction.
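
The comparison of monthly rent with mortgage cost described above can be illustrated with a minimal sketch. It is not the project's actual return index: the function names, the fixed-rate annuity formula, and the default interest rate, term and down-payment share are all assumptions chosen for illustration.

```python
def monthly_mortgage_payment(principal, annual_rate, years):
    """Monthly payment on a fixed-rate annuity loan (assumed loan type)."""
    n_payments = years * 12
    monthly_rate = annual_rate / 12
    if monthly_rate == 0:
        # With no interest, the principal is simply spread over all payments.
        return principal / n_payments
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


def monthly_rent_minus_mortgage(rent, sale_price, annual_rate=0.03,
                                years=30, down_payment_share=0.2):
    """Hypothetical per-unit indicator: monthly rent minus mortgage payment.

    All default parameter values are illustrative assumptions.
    """
    principal = sale_price * (1 - down_payment_share)
    return rent - monthly_mortgage_payment(principal, annual_rate, years)


if __name__ == "__main__":
    # Made-up example unit: 250,000 sale price, 1,100 monthly rent.
    print(round(monthly_rent_minus_mortgage(1100, 250_000), 2))
```

A positive value means the assumed rent would more than cover the assumed mortgage payment. Averaging such values per neighborhood and unit size would give one possible form of a return index.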