ONE-DAY WORKSHOP ON APPLIED MATHEMATICS 2024

SPEAKER

Gian Paolo Leonardi (Università degli Studi di Trento)

SCHEDULE

16:00 - 17:00

TITLE

A two-scale complexity measure for deep learning models

ABSTRACT

Deep learning models achieve outstanding performance on complex tasks such as image classification, object detection, and natural language processing. Yet a theoretical explanation of the impressive generalization capabilities of deep neural networks is still largely incomplete, and the design of such models relies more on the intuition and experience of practitioners than on solid theoretical guidelines. A key issue is therefore the definition of complexity measures that provide meaningful estimates of the generalization error and can robustly and efficiently evaluate the expressivity of different network topologies, thus facilitating the model selection process. In this direction, I will present a new complexity measure, the two-scale effective dimension (2sED), a box-covering dimension associated with the metric induced by the Fisher information matrix of a parametric statistical model. The 2sED supports a generalization bound and, thanks to its specific form, can be easily adapted to models of Markovian type (i.e., stochastic generalizations of feedforward deep neural networks). In particular, the sequential, layer-by-layer computation required to evaluate a tight lower bound of the 2sED (the "lower 2sED") has reduced computational demands and can be applied to large models. Finally, I will present experimental evidence that the post-training performance of parametric models is strongly correlated with the value of the (lower) 2sED.
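To make the central objects concrete, the following is a minimal, illustrative sketch (not the authors' actual 2sED algorithm) of the ingredients the abstract mentions: an empirical Fisher information matrix for a tiny parametric model (here, a logistic regression chosen for simplicity), and a crude spectrum-based effective-dimension proxy. The function names `fisher_matrix` and `effective_dimension` and the scale parameter `gamma` are hypothetical choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score(theta, x, y):
    # Gradient of log p(y | x, theta) for the Bernoulli likelihood
    # of logistic regression: (y - p) * x.
    return (y - sigmoid(x @ theta)) * x

def fisher_matrix(theta, X, Y):
    # Empirical Fisher information: average outer product of per-sample
    # score vectors. This is the matrix that induces the metric
    # mentioned in the abstract.
    grads = np.array([score(theta, x, y) for x, y in zip(X, Y)])
    return grads.T @ grads / len(X)

def effective_dimension(F, gamma=1.0):
    # Crude proxy: count how many eigendirections of F remain
    # significant relative to the scale gamma. This is only meant to
    # convey the "dimension at a scale" idea, not the 2sED itself.
    eigvals = np.linalg.eigvalsh(F)
    return float(np.sum(eigvals / (eigvals + gamma)))

# Synthetic data from the model itself, so the Fisher matrix is meaningful.
d = 5
theta = rng.normal(size=d)
X = rng.normal(size=(200, d))
Y = (rng.random(200) < sigmoid(X @ theta)).astype(float)

F = fisher_matrix(theta, X, Y)
d_eff = effective_dimension(F, gamma=0.1)
print(d_eff)  # a value between 0 and d
```

In this toy setting the proxy interpolates between 0 (all Fisher eigenvalues negligible at scale `gamma`) and the parameter count `d`; the actual 2sED instead measures how the number of Fisher-metric covering boxes grows across two scales, and its "lower 2sED" bound is evaluated layer by layer for Markovian models.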

This research is in collaboration with Massimiliano Datres (UniTN), Alessio Figalli (ETHZ), and David Sutter (IBM Center - Zürich).