Ioannis Chalkiadakis

Title: Statistical Natural Language Processing and Sentiment Analysis with Time-Series and Gaussian Processes

Abstract: This talk will present our novel "sentometrics" (sentiment + econometrics) approach to extracting text-based insights about the cryptocurrency market. We will start with what we see as the main challenges in Natural Language Processing for statistics, econometrics and social science research, before presenting our sentometrics framework for constructing text-based time series and sentiment signals. We will then formulate two problem statements set in the cryptocurrency space and illustrate how our framework integrates naturally with econometric models to provide solutions. In the first problem, we demonstrate how to jointly model text and finance data sampled at multiple timescales while accounting for stylised features, in particular long memory. In the second problem, we combine our text time-series framework with a very flexible Gaussian Process statistical causality framework developed by Zaremba and Peters (2022, https://doi.org/10.1007/s11009-022-09928-3) to extract statistical causality relationships in the cryptocurrency space. We will conclude the talk with a brief outlook on our current research.