Programme

First Session (Chair: Riccardo Passeggeri)

9:10 - 9:20 Welcome and Opening

9:20 - 10:00 Giulia Livieri: Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis

10:00 - 10:40 Rustam Ibragimov: New Approaches to Robust Inference on Market (Non-)Efficiency, Volatility Clustering and Nonlinear Dependence

10:40 - 11:20 Valentina Corradi: Predictive Ability with Possibly Overlapping Models

11:20 - 11:40 Coffee Break


Second Session (Chair: Yanbo Tang)

11:40 - 12:20 Heather Battey: Inducement of population sparsity

12:20 - 13:00 Daniele Bianchi: Dynamic sparse regression models


13:00 - 14:00 Lunch


Third Session (Chair: Silvia Sarpietro)

14:00 - 14:40 Svetlana Bryzgalova: Missing Financial Data

14:40 - 15:20 Paolo Zaffaroni: Frequency-band estimation of the number of factors detecting the main business cycle shocks

15:20 - 15:30 Small Coffee Break

15:30 - 16:10 Martin Weidner: Moment Conditions for Dynamic Panel Logit Models with Fixed Effects

16:10 - 16:50 Andrew Harvey: Co-integration and control: assessing the impact of events using time series data

16:50 - 17:10 Coffee Break


Fourth Session (Chair: Francesco Sanna Passino)

17:10 - 17:50 Eric Renault: Efficient estimation of regression models with user-specified parametric model for heteroskedasticity

17:50 - 18:30 Alessandra Luati: On the optimality of score-driven models

18:30 - 19:10 Hao Ma: Conditional Latent Factor Models Via Econometrics-Based Neural Networks

19:10 - 19:15 Closing Remarks



Abstracts


Giulia Livieri: Designing Universal Causal Deep Learning Models: The Case of Infinite-Dimensional Dynamical Systems from Stochastic Analysis

Deep learning (DL) is becoming indispensable to contemporary stochastic analysis and finance; nevertheless, it is still unclear how to design a principled DL framework for approximating infinite-dimensional causal operators. This paper proposes a "geometry-aware" solution to this open problem by introducing a DL model-design framework that takes suitable infinite-dimensional linear metric spaces as inputs and returns universal sequential DL models adapted to these linear geometries: we call these models Causal Neural Operators (CNOs). Our main result states that the models produced by our framework can uniformly approximate, on compact sets and across arbitrary finite-time horizons, Hölder or smooth trace class operators which causally map sequences between the given linear metric spaces. Consequently, we deduce that a single CNO can efficiently approximate the solution operator to a broad range of SDEs, thus allowing us to simultaneously approximate predictions from families of SDE models, which is vital to computationally robust finance. We further deduce that the CNO can approximate the solution operator to most stochastic filtering problems, implying that a single CNO can simultaneously filter a family of partially observed stochastic volatility models. (Joint with Luca Galimberti and Anastasis Kratsios)
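
For orientation, a minimal sketch (our notation, not necessarily the authors') of what it means for an operator to "causally map sequences": an operator $F$ between sequence spaces over linear metric spaces $\mathcal{X}$ and $\mathcal{Y}$ is causal if its output at time $t$ depends only on the input path up to time $t$,
\[
F:\mathcal{X}^{\mathbb{Z}}\to\mathcal{Y}^{\mathbb{Z}},\qquad (F(x))_t=(F(\tilde{x}))_t \ \text{ whenever } x_s=\tilde{x}_s \ \text{for all } s\le t .
\]
The universality result then says that a single CNO can uniformly approximate such operators on compact sets of input paths over any finite time horizon.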


Rustam Ibragimov: New Approaches to Robust Inference on Market (Non-)Efficiency, Volatility Clustering and Nonlinear Dependence

We present novel, robust methods for inference on market (non-)efficiency, volatility clustering, and nonlinear dependence in financial return series. In contrast to existing methodology, the proposed methods are robust against non-linear dynamics and heavy-tailedness of returns. Specifically, they rely only on the return process being stationary and weakly dependent (mixing) with finite moments of a suitable order, which includes robustness against the power-law distributions associated with non-linear dynamic models such as GARCH and stochastic volatility. The methods are easy to implement and perform well in realistic settings. We revisit a recent study by Baltussen et al. (2019, Journal of Financial Economics, vol. 132, pp. 26-48) on autocorrelation in major stock indexes. Using our robust methods, we document weaker evidence of negative autocorrelation than the original study concludes.
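
The abstract does not spell out the test statistics, but a typical object of inference in this setting is the return autocorrelation
\[
\hat{\rho}(h)=\frac{\sum_{t=1}^{T-h}(r_t-\bar{r})(r_{t+h}-\bar{r})}{\sum_{t=1}^{T}(r_t-\bar{r})^2},
\]
with the null of weak-form efficiency corresponding to $\rho(h)=0$; the point of the robust methods is that inference on such quantities remains valid assuming only stationarity, weak dependence (mixing) and finite moments of a suitable order, rather than thin tails or linear dynamics.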


Valentina Corradi: Predictive Ability with Possibly Overlapping Models

This paper provides novel tests for comparing the out-of-sample predictive ability of two or more competing models that are possibly overlapping. The tests do not require pre-testing, allow for dynamic misspecification, and are valid under different estimation schemes and loss functions. In pairwise model comparisons, the test is constructed by adding a random perturbation to both the numerator and denominator of a standard Diebold-Mariano test statistic. The perturbation prevents degeneracy in the presence of overlapping models but becomes asymptotically negligible otherwise. The test has correct size uniformly over all null data generating processes. A similar idea is used to develop a superior predictive ability test for the comparison of multiple models against a benchmark. Monte Carlo simulations show that our tests have accurate finite-sample rejection rates. Finally, an application to forecasting U.S. excess bond returns provides evidence in favour of models using macroeconomic factors. (Joint with Jack Fosten and Daniel Gutknecht)
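
For context, the standard Diebold-Mariano statistic compares average out-of-sample losses of the two models,
\[
DM_P=\frac{\sqrt{P}\,\bar{d}_P}{\hat{\sigma}_P},\qquad \bar{d}_P=\frac{1}{P}\sum_{t}\big(L(\hat{e}_{1,t})-L(\hat{e}_{2,t})\big),
\]
and degenerates when the models overlap, since the loss differentials are then identically zero. Schematically (the exact construction and rates are in the paper), the proposed test perturbs both pieces, replacing $\bar{d}_P$ by $\bar{d}_P+a_P\xi$ and $\hat{\sigma}_P^2$ by $\hat{\sigma}_P^2+b_P\eta$ with user-generated draws $\xi,\eta$ and sequences $a_P,b_P$ chosen so that the perturbation guards against degeneracy under overlap but is asymptotically negligible otherwise.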


Heather Battey: Inducement of population sparsity

The work on parameter orthogonalisation by Cox and Reid (1987) is presented as inducement of population-level sparsity. The latter is taken as a unifying theme for the talk, in which sparsity-inducing parameterisations or data transformations are sought. Three recent examples are framed in this light: sparse parameterisations of covariance models; construction of factorisable transformations for the elimination of nuisance parameters; and inference in high-dimensional regression. The solution strategy for the problem of exact or approximate sparsity inducement appears to be context specific and may entail, for instance, solving one or more partial differential equations, or specifying a parameterised path through transformation or parameterisation space.
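
For readers unfamiliar with Cox and Reid (1987): with interest parameter $\psi$ and nuisance parameter $\lambda$, a reparameterisation $\lambda=\lambda(\psi,\phi)$ is information orthogonal if the cross block of the Fisher information vanishes,
\[
i_{\psi\phi}=E\!\left[-\frac{\partial^2 \ell}{\partial\psi\,\partial\phi}\right]=0,
\]
and, for scalar $\psi$, such a reparameterisation is obtained by solving the partial differential equation $i_{\lambda\lambda}\,\partial\lambda/\partial\psi=-i_{\lambda\psi}$. This is the sense in which inducing exact or approximate sparsity (or orthogonality) may require solving one or more partial differential equations.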


Daniele Bianchi: Dynamic sparse regression models

We develop a novel Bayesian inference approach for dynamic variable selection in large-scale time-varying regression models. Specifically, we propose an efficient variational Bayes algorithm which concentrates the posterior density around the true time-varying regression parameters, so that potentially different subsets of active predictors can be identified over time. An extensive simulation study provides evidence that our approach produces more accurate inference than existing static and dynamic variable selection methods. This holds irrespective of the sparsity assumption on the underlying data generating process. We empirically evaluate the accuracy of point and density forecasts in the context of two common problems in macroeconomics and finance: inflation forecasting and stock return predictability. The results show that more accurate estimates translate into substantial out-of-sample gains compared to benchmark Bayesian methods. In addition, in-sample parameter estimates highlight the importance of dynamic sparsity for fully capturing the extent of both inflation and stock return predictability.
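
The abstract does not give the model explicitly; a common dynamic spike-and-slab formulation of this type, offered here only as an illustrative sketch, is
\[
y_t=x_t'(\gamma_t\odot\beta_t)+\varepsilon_t,\qquad \gamma_{j,t}\in\{0,1\},
\]
where $\gamma_{j,t}$ indicates whether predictor $j$ is active at time $t$ and the coefficients $\beta_t$ follow a slowly evolving (e.g. autoregressive) law of motion; variational Bayes then approximates the joint posterior of $(\gamma_{1:T},\beta_{1:T})$ so that the active set can change over time.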


Svetlana Bryzgalova: Missing Financial Data

We document the widespread nature and structure of missing observations of firm fundamentals and show how to deal with them systematically. Missing financial data affects over 70% of firms, which together represent about half of the total market capitalisation. Firm fundamentals have complex systematic missing patterns, invalidating traditional ad-hoc approaches to imputation. We propose a novel imputation method that exploits both the time-series and cross-sectional dependence of the data and allows for general systematic patterns of missingness, yielding a fully observed panel of firm fundamentals. We document important implications for risk premia estimates, cross-sectional anomalies, and portfolio construction. (Joint with Sven Lerner, Martin Lettau, and Markus Pelger)
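
One schematic way to think about an imputation that combines cross-sectional and time-series dependence (offered as an illustration, not necessarily the authors' exact estimator) is a latent factor fit with a time-series correction,
\[
\hat{x}_{i,t}=\hat{\Lambda}_i'\hat{F}_t+\hat{\phi}\,\big(x_{i,t-1}-\hat{\Lambda}_i'\hat{F}_{t-1}\big),
\]
where the factor term borrows information from other firms' contemporaneous fundamentals and the autoregressive term from the firm's own past; the key requirement is that the procedure remains valid under systematic, rather than purely random, patterns of missingness.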


Paolo Zaffaroni: Frequency-band estimation of the number of factors detecting the main business cycle shocks

We introduce a consistent estimator for the number of shocks driving large-dimensional factor models. Our estimator can be applied to single frequencies as well as to specific frequency bands, making it suitable for disentangling shocks affecting dynamic macroeconomic models with a factor model representation. Its small-sample performance in simulations is excellent, even in estimating the number of shocks that drive medium-sized DSGE models. We apply our estimator to the FRED-QD dataset, finding that the U.S. macroeconomy is driven by two shocks: an inflationary demand shock and a deflationary supply shock.
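
Schematically (our paraphrase of the setting, not the paper's exact statement), in a large-dimensional dynamic factor model the spectral density of the data at frequency $\theta$ splits as
\[
\Sigma_x(\theta)=\Sigma_\chi(\theta)+\Sigma_e(\theta),\qquad \operatorname{rank}\Sigma_\chi(\theta)=q(\theta),
\]
where $q(\theta)$ is the number of common shocks active at frequency $\theta$. As the cross-section grows, the $q(\theta)$ largest eigenvalues of $\Sigma_x(\theta)$ diverge while the remaining ones stay bounded, so the number of shocks at a single frequency or over a frequency band can be estimated by counting diverging eigenvalues over that band.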


Martin Weidner: Moment Conditions for Dynamic Panel Logit Models with Fixed Effects

This paper investigates the construction of moment conditions in discrete choice panel data models with individual-specific fixed effects. We describe how to systematically explore the existence of moment conditions that do not depend on the fixed effects, and we demonstrate how to construct them when they exist. Our approach is closely related to the numerical "functional differencing" construction in Bonhomme (2012), but our emphasis is on finding explicit analytic expressions for the moment functions. We first explain the construction and give examples of such moment conditions in various models. Then, we focus on the dynamic binary choice logit model and explore the implications of the moment conditions for identification and estimation of the model parameters that are common to all individuals. (Joint with Bo E. Honoré)
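
As a concrete instance (standard notation, not taken verbatim from the paper), the dynamic binary choice logit model with fixed effects is
\[
\Pr(y_{it}=1\mid y_i^{t-1},x_i,\alpha_i)=\frac{\exp(\gamma y_{i,t-1}+x_{it}'\beta+\alpha_i)}{1+\exp(\gamma y_{i,t-1}+x_{it}'\beta+\alpha_i)},
\]
and a valid moment function is a known function $m(y_i,x_i,\theta)$ of the data and the common parameters $\theta=(\gamma,\beta)$ satisfying $E[m(y_i,x_i,\theta_0)\mid x_i,\alpha_i]=0$ for every value of the fixed effect $\alpha_i$, so that $\theta$ can be estimated by GMM without estimating, or restricting, the $\alpha_i$.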


Andrew Harvey: Co-integration and control: assessing the impact of events using time series data

Control groups can provide counterfactual evidence for assessing the impact of an event or policy change on a target variable. We argue that fitting a multivariate time series model offers potential gains over a direct comparison between the target and a weighted average of controls. More importantly, it highlights the assumptions underlying methods such as difference-in-differences and synthetic control, suggesting ways to test these assumptions. Gains from simple and transparent time series models are analysed using examples from the literature, including the California smoking law of 1989 and German re-unification. We argue that selecting controls using a time series strategy is preferable to existing data-driven regression methods. (Joint with Stephen Thiele)
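
As a reminder of the benchmarks being compared against: difference-in-differences estimates the effect of the event as the change in the target minus the change in the controls, while synthetic control builds the counterfactual as a weighted average of controls,
\[
\hat{\tau}_{DiD}=(\bar{y}_{1,\text{post}}-\bar{y}_{1,\text{pre}})-(\bar{y}_{0,\text{post}}-\bar{y}_{0,\text{pre}}),\qquad \hat{y}_{1t}(0)=\sum_{j}w_j\,y_{jt}.
\]
The parallel-trends and weighting assumptions implicit in these formulae are exactly what an explicit multivariate time series model, with common trends or co-integration, makes transparent and testable.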


Eric Renault: Efficient estimation of regression models with user-specified parametric model for heteroskedasticity

Several recent papers propose methods to estimate regression (conditional mean) parameters at least as precisely as the ordinary least squares (OLS) and parametric weighted least squares (WLS) estimators, even when the parametric model for the conditional variance of the regression error is misspecified. Beyond WLS, the literature also proposes adaptive extensions, possibly combining OLS and WLS, and even machine learning approaches to estimating the skedastic function. We question this literature by arguing that, similarly to the Targeted Maximum Likelihood Learning (TMLL) of van der Laan and Rubin (2006), the WLS estimator is targeted at being based on a good estimator of the skedastic function and might therefore result in a poor estimator of a particular regression parameter. This is reminiscent of the motivation for TMLL, namely that "the density estimator was targeted to be a good estimator of the density and might therefore result in a poor estimator of a particular smooth function of the density". In this article, we propose a weighted least squares estimator based on a targeted estimation principle (TWLS), as well as suitably targeted convex combinations of OLS and TWLS. We show that, as far as estimation of the target parameter is concerned, these targeted approaches outperform all the previous estimators based on standard WLS (including those with weights estimated by machine learning) or on adaptive combinations of OLS and WLS. We also demonstrate the same superior performance in simulations under the designs used in these recent papers (in particular the various DGPs and user-specified heteroskedastic models studied by Romano and Wolf (2017)). The principle of targeted estimation, though not our adaptation of it, dates back to early research on the optimal design of experiments, and indeed to Cragg (1992) in the context of WLS; it has also been gainfully used in the recent literatures on doubly robust estimation and regression discontinuity design. Our approach is valid for any model defined by conditional expectations, including non-linear and/or instrumental variables regression, and for any estimation target defined by a smooth function of the regression parameters. (Joint with Saraswata Chaudhuri)
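
For reference, the parametric WLS estimator that the targeted version modifies is, in standard notation,
\[
\hat{\beta}_{WLS}=\Big(\sum_i\frac{x_i x_i'}{v(x_i,\hat{\gamma})}\Big)^{-1}\sum_i\frac{x_i y_i}{v(x_i,\hat{\gamma})},
\]
where $v(x,\gamma)$ is the user-specified parametric skedastic model and $\hat{\gamma}$ is typically fitted to squared OLS residuals. The targeted version (TWLS) instead chooses the estimated weighting with the precision of one particular coefficient of interest in mind; this is our gloss of the abstract, not the paper's formal statement.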


Alessandra Luati: On the optimality of score-driven models

Score-driven models have recently been introduced as a general framework for specifying time-varying parameters of conditional densities. The score enjoys stochastic properties that make these models easy to implement and convenient to apply in several contexts, ranging from biostatistics to finance. Score-driven parameter updates have been shown to be optimal in terms of locally reducing a local version of the Kullback–Leibler divergence between the true conditional density and the postulated density of the model. A key limitation of this optimality property is that it holds only locally, both in the parameter space and in the sample space, yielding a definition of local Kullback–Leibler divergence that is in fact not a divergence measure. The current paper shows that score-driven updates satisfy stronger optimality properties based on a global definition of the Kullback–Leibler divergence. In particular, it is shown that score-driven updates reduce the distance between the expected updated parameter and the pseudo-true parameter. Furthermore, depending on the conditional density and the scaling of the score, the optimality result can hold globally over the parameter space, which can be viewed as a generalisation of the monotonicity property of the stochastic gradient descent scheme. Several examples illustrate how the results derived in the paper apply to specific models under different easy-to-check assumptions, and provide a formal method for selecting the link function and the scaling of the score. (Joint with Paolo Gorgi (VU Amsterdam) and Sacha Lauria (University of Bologna))
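
For readers outside this literature, the canonical score-driven (GAS) update of a time-varying parameter $f_t$ is
\[
f_{t+1}=\omega+\beta f_t+\alpha s_t,\qquad s_t=S_t\,\frac{\partial\log p(y_t\mid f_t;\theta)}{\partial f_t},
\]
where $S_t$ is a scaling matrix, often a power of the inverse Fisher information. The optimality results discussed in the talk concern how this update moves the expected updated parameter towards the pseudo-true parameter under a global Kullback–Leibler criterion, and how the choice of scaling $S_t$ and of the link function affects whether that holds locally or globally.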


Hao Ma: Conditional Latent Factor Models Via Econometrics-Based Neural Networks

I develop a hybrid methodology that incorporates an econometric identification strategy into artificial neural networks for studying conditional latent factor models. The time-varying betas are assumed to be unknown functions of numerous firm characteristics, and the statistical factors are population cross-sectional OLS estimators for given beta values. Hence, identifying betas and factors boils down to identifying only the beta function, which is equivalent to solving a constrained optimization problem. For estimation, I construct neural networks customized to solve this constrained optimization problem, which yields a feasible non-parametric estimator of the beta function. Empirically, I conduct the analysis on a large unbalanced panel of monthly data on US individual stocks with around 30,000 firms, 516 months, and 94 characteristics. I find that 1) the hybrid method outperforms the benchmark econometric method and the neural networks method in terms of explaining out-of-sample return variation, 2) betas are highly non-linear in firm characteristics, 3) two conditional factors explain over 95% of the variation of the factor space, and 4) hybrid methods with literature-based characteristics (e.g., the book-to-market ratio) outperform those with COMPUSTAT raw features (e.g., book value and market value), emphasizing the value of academic knowledge from a Man vs. Machine angle.
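
Schematically, and in standard notation for this literature rather than necessarily the paper's, the model is
\[
r_{i,t+1}=\beta(z_{i,t})'f_{t+1}+\varepsilon_{i,t+1},\qquad f_{t+1}=\big(\beta_t'\beta_t\big)^{-1}\beta_t'\,r_{t+1},
\]
where $z_{i,t}$ are firm characteristics, $\beta(\cdot)$ is the unknown beta function approximated by a neural network, $\beta_t$ stacks the betas $\beta(z_{i,t})$ across firms, and the second expression is the (sample analogue of the) cross-sectional OLS definition of the factors given the betas, which is the constraint the customized network is built to enforce.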