Workshop

DATA: at the crossroads between Mathematics, Statistics and Economics

September 7 - 8, 2023

The vast amount of data available in our era is both a challenge and an opportunity. Economics could greatly benefit from it by developing new predictive methods based on data, without the need for strong assumptions about the system under study.

To do this, new mathematical models and statistical methods are needed to provide economists with the tools to confront this challenge.

The idea of this workshop is to build a bridge between these three disciplines, providing a venue for experts to meet, exchange ideas, and present their work.


Lecturers:

Abstract: By implementing a simultaneous “reciprocal dictator game” with paired individuals who play with each other, at the same time, the dual role of dictators and recipients, we investigate whether distributional preferences or belief-dependent preferences have better predictive power. We compare the theoretical predictions of two models of social preferences that can explain behavior in this gift-exchange context: distributional preferences in the form of opportunity-based kindness à la Saito (AER, 2013), and belief-dependent preferences in the form of intention-based kindness à la Dufwenberg and Kirchsteiger (GEB, 2004). By experimentally disentangling the features that lead from the standard dictator game to our reciprocal dictator game (equality of endowments, equality of opportunities, and beliefs about the other’s actions), we test the consistency of the predictions of the two models under each treatment manipulation. Overall, our results support the view that other-regarding behavior in gift-exchange settings is driven mainly by players’ beliefs about the other’s kindness rather than by equality of opportunities to be kind to the other.

Slides of the talk here


Title: A Hybrid Machine Learning Framework for Tax Revenues Monitoring (Eugenio Cangiano, Francesco De Napoli, Sabrina Sabatini, Daria Scacciatelli; Policy, Forecasts and Statistical Analysis, Sogei)

Link: https://www.sogei.it/it/sogei-homepage/soluzioni/modelli-previsionali-e-analisi-statistiche.html

Abstract: Monitoring the amount of tax revenues at least at monthly frequency is extremely important for assessing the convergence of public finance figures to annual objectives. This task is all the more relevant because annual budget forecasts, provided by the Italian MEF State General Accounting Department at the beginning of every fiscal year, can be subject to frequent revisions, e.g. in case of changes in fiscal policy measures and/or updates of macroeconomic scenarios. Therefore, to assess the impact of such revisions on annual tax revenue forecasts, it is useful to develop a higher-frequency model which incorporates additional information collected throughout the year. This work proposes a new hybrid machine learning framework (named HGB), based on the Gradient Boosting algorithm, for the prediction of tax revenues. Short-term forecasts are obtained using, as explanatory variables, a large database of macroeconomic time series at monthly frequency. The proposed HGB framework combines feature selection techniques, autoregressive models and machine learning regression algorithms. Datastream's macroeconomic time-series database was processed by the Boruta algorithm to reduce data dimensionality. Afterwards, SARIMA predictions of the selected features over the forecast horizon were fed to the XGBoost model to derive the tax revenue prediction. The experimental results of the HGB framework, applied to the excise duty on mineral oil (the tax revenue category on which the model was first tested), showed high predictive accuracy and better performance than traditional autoregressive models. The framework was evaluated using k-fold cross-validation on a rolling basis.
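
As a rough sketch of this pipeline, the Python fragment below chains the three stages described in the abstract: Boruta feature selection, SARIMA forecasts of the selected features, and XGBoost regression. All data, column names, model orders and hyperparameters are illustrative assumptions, not the settings used in the actual framework.

```python
# Illustrative HGB-style pipeline: Boruta -> SARIMA -> XGBoost.
# Everything below (data shapes, orders, hyperparameters) is assumed.
import numpy as np
import pandas as pd
from boruta import BorutaPy
from sklearn.ensemble import RandomForestRegressor
from statsmodels.tsa.statespace.sarimax import SARIMAX
from xgboost import XGBRegressor

def hgb_forecast(X: pd.DataFrame, y: pd.Series, horizon: int) -> np.ndarray:
    """Forecast tax revenues `horizon` months ahead from monthly features X."""
    # 1) Reduce the dimensionality of the macroeconomic panel with Boruta.
    rf = RandomForestRegressor(n_jobs=-1, max_depth=5)
    selector = BorutaPy(rf, n_estimators="auto", random_state=0)
    selector.fit(X.values, y.values)
    selected = X.columns[selector.support_]

    # 2) Extend each selected feature over the forecast horizon with SARIMA.
    future = {}
    for col in selected:
        sarima = SARIMAX(X[col], order=(1, 1, 1), seasonal_order=(1, 0, 1, 12))
        future[col] = sarima.fit(disp=False).forecast(steps=horizon)
    X_future = pd.DataFrame(future)

    # 3) Train XGBoost on history; predict on the SARIMA-extended features.
    model = XGBRegressor(n_estimators=300, learning_rate=0.05)
    model.fit(X[selected], y)
    return model.predict(X_future)
```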

Slides of the talk here


Title: Optimal Control and Differential Games for Pollution Management and Climate Agreements

Abstract: Over the last decade, pollution and climate change have become a research focus across several disciplines, but few papers consider the problem of designing international environmental policies in the framework of dynamic optimal control/game theory (and typically under strong linearity assumptions). Our papers are a first attempt to study the problem in a more general nonlinear context while, at the same time, retaining tractability. We present the model, some ideas for its solution in simple cases, some first insights, and some open problems.
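
For concreteness, a stylized formulation of such a pollution-control differential game is sketched below; it is a standard textbook setup, not necessarily the specification used in the talk.

```latex
% Stylized n-country pollution game (illustrative): country i chooses
% emissions e_i(t) to maximize discounted benefits net of damages from
% the common pollution stock P(t).
\max_{e_i(\cdot)} \; J_i = \int_0^\infty e^{-\rho t}
  \bigl[ B_i(e_i(t)) - D_i(P(t)) \bigr] \, dt ,
\qquad
\dot P(t) = \sum_{j=1}^{n} e_j(t) - \delta P(t), \quad P(0) = P_0 .
```

Here $B_i$ are benefit functions, $D_i$ damage functions, $\delta$ the natural decay rate of pollution and $\rho$ the discount rate; the "strong linearity assumptions" mentioned above typically amount to linear or linear-quadratic choices of $B_i$ and $D_i$.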

Slides of the talk here


Abstract: Because aspirations determine when managers search for new alternatives, the level of aspirations, and how it changes with environmental conditions, has important performance consequences. While having a high aspiration is often useful, past research has shown that prolonged search for a superior alternative may not make sense in turbulent conditions, when the profitability of an adopted alternative may quickly change. Using a simple and analytically tractable model of problemistic search, we show that when aspirations are defined in relative terms, i.e., being better than a certain fraction of others, the opposite conclusion holds: a higher aspiration leads to higher performance in settings with more turbulence. These contradictory findings result from the relationship between absolute and relative outcomes generated by social comparison, which is moderated by environmental turbulence: depending on environmental conditions, poor individual outcomes can be good in relative terms, and vice versa. Our study enriches the debate on the aspiration-performance relationship by drawing attention to the relevance of the specification of aspirations. Moreover, our analyses have important managerial implications and inform how targets should be set at both the population and the individual level.
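
To make the setup concrete, here is a toy simulation of problemistic search with a relative aspiration (the payoff quantile q below which agents search). The dynamics are assumptions of mine for illustration; this is not the authors' analytical model and is not meant to reproduce their results.

```python
# Toy problemistic search with a relative aspiration level q and
# environmental turbulence tau (all dynamics are illustrative).
import numpy as np

def simulate(q: float, tau: float, n_agents: int = 10_000,
             n_periods: int = 200, seed: int = 0) -> float:
    """Mean payoff in the population after repeated search."""
    rng = np.random.default_rng(seed)
    payoff = rng.random(n_agents)
    for _ in range(n_periods):
        # Turbulence: each adopted alternative is reshuffled with prob. tau.
        shocked = rng.random(n_agents) < tau
        payoff[shocked] = rng.random(shocked.sum())
        # Problemistic search: agents below the q-th payoff quantile
        # abandon their alternative and draw a new one at random.
        searching = payoff < np.quantile(payoff, q)
        payoff[searching] = rng.random(searching.sum())
    return float(payoff.mean())

# Compare a low and a high relative aspiration in calm vs turbulent settings.
for tau in (0.0, 0.3):
    print(f"tau={tau}: low q -> {simulate(0.2, tau):.3f}, "
          f"high q -> {simulate(0.8, tau):.3f}")
```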

Slides of the talk here


Title: Modern methods in volatility estimation

Abstract: I first discuss different notions of volatility in financial markets. I explain the concept of realised volatility in connection with the notion of quadratic variation, and show how it can be estimated from high-frequency data. I review the properties of volatility estimated on empirical financial time series. I then introduce the concept of fractional and rough volatility and show how it is supported by empirical observations. Finally, I propose a framework to extend the rough volatility model to a multi-asset setting.
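
As a minimal illustration of the first part of the talk, realised variance is the sum of squared intraday log-returns, a standard estimator of the quadratic variation of the log-price; the prices below are simulated stand-ins for high-frequency data.

```python
# Realised variance from intraday prices (illustrative, simulated data).
import numpy as np

def realised_variance(prices: np.ndarray) -> float:
    """Daily realised variance: sum of squared intraday log-returns."""
    log_returns = np.diff(np.log(prices))
    return float(np.sum(log_returns ** 2))

rng = np.random.default_rng(1)
# 78 five-minute observations in a 6.5-hour trading day, simulated here.
prices = 100 * np.exp(np.cumsum(0.001 * rng.standard_normal(78)))
rv = realised_variance(prices)       # realised variance for the day
ann_vol = np.sqrt(252 * rv)          # a rough annualised volatility proxy
```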


Title: Machine Learning for Zombie Hunting: Predicting Distress from Firms' Accounts and Missing Values 

Abstract: We propose machine learning techniques to predict zombie firms. First, we derive the risk of failure by training and testing our algorithms on disclosed financial information and non-random missing values of 304,906 firms active in Italy from 2008 to 2017. Then, we identify the highest financial distress with predictions that lie above a threshold for which a combination of the false positive rate (falsely predicting firm failure) and the false negative rate (falsely predicting that a firm stays active) is minimized. We thus identify zombies as firms that persist in a state of financial distress, i.e., whose forecasts fall into the risk category above the threshold for at least three consecutive years. For this purpose, we implement a gradient boosting algorithm (XGBoost) that exploits information about missing values. The inclusion of missing values in our predictive model is crucial, because patterns of undisclosed accounts are correlated with firm failure. We then show that our preferred machine learning algorithm outperforms (i) proxy models such as Z-scores and the Distance-to-Default, (ii) traditional econometric methods, and (iii) other widely used machine learning techniques. We provide evidence that zombies are on average less productive and smaller, and that their number tends to increase in times of crisis. Finally, we argue that our application can help financial institutions and public authorities design evidence-based policies, e.g., optimal bankruptcy laws and information disclosure policies.
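
A minimal sketch of the thresholding step described above, with synthetic scores standing in for the model's predicted failure risks: the cutoff is chosen to minimise the sum of the false positive and false negative rates.

```python
# Choose the distress cutoff minimising FPR + FNR (illustrative data).
import numpy as np
from sklearn.metrics import roc_curve

def distress_threshold(y_true: np.ndarray, y_score: np.ndarray) -> float:
    """Cutoff minimising FPR + FNR (equivalently, maximising Youden's J)."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    return float(thresholds[np.argmin(fpr + (1.0 - tpr))])

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=1_000)                   # 1 = failed firm
score = np.clip(0.3 * y + 0.7 * rng.random(1_000), 0.0, 1.0)
cutoff = distress_threshold(y, score)
at_risk = score > cutoff  # above the cutoff 3 years in a row -> zombie
```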


Abstract: Whenever an election is run by dividing the population into electoral districts, each decided by winner-takes-all, the gerrymandering phenomenon appears. Over two centuries ago, Massachusetts' Governor Gerry modified the boundaries of his own electoral district in order to ensure his re-election. Indeed, whenever two parties (or candidates) run for an election, if one of them has more than one third of the votes, the electoral districts can be shaped so as to guarantee its victory. A two-fold problem thus arises: on the one hand, a method is needed to measure how gerrymandered a district is; on the other hand, it would be useful to have an objective method to shape districts that avoids the possibility of gerrymandering. A great part of the mathematical studies of the problem have been devoted either to geometric properties of the districts or to the application of the theory of isoperimetric problems in this area. Recently it has been pointed out that a discrete-geometry approach, based on graphs and networks, would be more realistic. In this talk I will present a discrete districting plan proposed by Giorgio Saracco and myself and analyse its properties.
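
A back-of-the-envelope illustration of the one-third claim (the numbers are mine, not from the talk): with freely drawn boundaries, a bare majority in a bare majority of districts is enough.

```latex
% 900 voters split into 9 districts of 100 voters each: winning
% 5 districts with 51 votes each requires only
5 \times 51 = 255 \ \text{votes} \approx 28\% \ \text{of } 900,
% so a party holding more than one third of the votes (> 300) has
% ample slack to be assigned winning majorities in 5 districts.
```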

Slides of the talk here


Abstract: In the last decades, topological methods have been gaining attention due to their scale-free and robust-to-noise nature. Through topological methods one can focus on topological descriptors to compare and classify data based on their shape. More specifically, the combination of persistent homology and the analysis of the corresponding discrete vector fields has proven very effective in visualizing big and heterogeneous data by detecting critical cells and domain segmentations depending on a discrete gradient flow. The specific case of multivariate data still demands further investigation, since it requires detecting and interpreting the possible interdependence among data components. To this end, we introduce and study a notion of perfectness for discrete gradient vector fields with respect to multi-parameter persistent homology, called relative-perfectness. As a natural generalization of the usual perfectness in Morse theory for homology, relative-perfectness entails having the least number of critical cells relevant for multi-parameter persistence. We discuss the computability and efficiency of relative-perfect discrete gradient fields and, in the final part, we present some experimental comparisons with the notion of Pareto set from the smooth case.
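
As a small one-parameter taste of these tools (the multi-parameter and discrete Morse machinery of the talk goes well beyond this), the snippet below computes persistent homology of a noisy circle with the gudhi library; the data are an assumption for illustration.

```python
# One-parameter persistent homology of a noisy circle via a Rips complex.
import numpy as np
import gudhi

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 100)
points = np.c_[np.cos(theta), np.sin(theta)] \
         + 0.05 * rng.standard_normal((100, 2))

rips = gudhi.RipsComplex(points=points, max_edge_length=2.0)
st = rips.create_simplex_tree(max_dimension=2)
diagram = st.persistence()            # list of (dimension, (birth, death))

# One long-lived 1-dimensional class should capture the circle's loop.
loops = [bd for dim, bd in diagram if dim == 1]
print(max(loops, key=lambda bd: bd[1] - bd[0]))
```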

Slides of the talk here

Classroom: All lectures will take place in classroom E4, Campus Einaudi.


Schedule: 

September 7

9:00 - 9:30 Opening and welcome

9:30 - 10:45 L. Marengo

10:45 - 11:15  Break

11:15 - 12:30  A. Saracco

12:30 - 14:00 Lunch

14:00 - 15:15 E. Cangiano

15:15 - 15:30 Break

15:30 - 16:45 G. Attanasi

19:30 - Social Dinner


September 8

9:30 - 10:45 S. Scaramuccia

10:45 - 11:15 Break

11:15 - 12:30 M. Riccaboni

12:30 - 14:00 Lunch

14:00 - 15:15 P. Pigato

15:15 - 15:30 Break

15:30 - 16:45 F. Gozzi