Organization: Sapienza University of Rome
Abstract
Recently, machine learning tools, in particular graph neural networks, have been widely applied to hard optimization problems, and many claims of superiority over standard algorithms have been made. However, these claims often rest on shaky ground, because they lack standard benchmarks built on genuinely hard problems. We look at this issue from a statistical mechanics point of view, focusing on random instances. We release our benchmark, together with the performance reached by standard and graph neural network algorithms, to set up a fair comparison between them. We conclude by presenting the challenges that neural networks face in solving these problems.
Biography
Maria Chiara Angelini is a Researcher in Theoretical Physics at Sapienza University of Rome. Her research focuses on the physics of disordered systems, including spin glasses and structural glasses, which she investigates primarily using advanced renormalization group techniques. She applies statistical physics methods, such as the replica and cavity methods, to complex optimization and inference problems, with a particular focus on understanding and benchmarking algorithmic performance. She has also exported these analytical and numerical frameworks to interdisciplinary fields such as theoretical ecology. Angelini is the author of numerous publications in high-impact journals and a regularly invited speaker at international conferences. Her work is distinguished by a synergy of deep theoretical insight and the development of cutting-edge numerical algorithms.
Organization: Huawei
Abstract
While experimental research on Large Language Models (LLMs) is progressing rapidly, understanding their inner workings remains a theoretical challenge. In this talk, we open the "black box" of LLMs by introducing a Semantic Information Theory. Unlike traditional theories based on bits, our framework posits the token as the fundamental unit.
We will explore how principles like rate-distortion theory and Granger causality apply to LLMs, defining structure-agnostic measures for every stage: from the directed rate-distortion function in pre-training to the semantic information flow during inference. Furthermore, we will present a general definition of autoregressive LLMs, showing how to theoretically derive the performance bounds (such as ELBO and memory capacity) of architectures like Transformers, Mamba, and LLaDA. This presentation aims to provide the necessary theoretical tools to deeply understand the semantic principles behind LLMs.
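As background for the rate-distortion part of the talk, recall the classical Shannon rate-distortion function, which the framework above lifts from bits to tokens; this is only the textbook definition, not the directed, token-level variant introduced in the talk:

```latex
% Classical rate-distortion function for a source X, reconstruction \hat{X},
% and distortion measure d(.,.); the talk's directed, token-level
% generalization is built on top of this object.
\[
  R(D) \;=\; \min_{\, p(\hat{x}\mid x)\,:\; \mathbb{E}[\,d(X,\hat{X})\,]\,\le\, D}\; I(X;\hat{X}) .
\]
```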
Biography
Bo Bai received the Ph.D. degree from the Department of Electronic Engineering, Tsinghua University, Beijing, China, in 2010. He was a Research Associate with the Department of ECE, HKUST, from 2010 to 2012. From July 2012 to January 2017, he was an Assistant Professor with the Department of Electronic Engineering, Tsinghua University. Currently, he is an Information Theory Scientist and Director of the Theory Lab, Huawei, Hong Kong. His research interests include classic and post-Shannon information theory, B5G/6G mobile networking, and graph informatics. He is an IEEE Senior Member and has authored more than 130 papers in major IEEE and ACM journals and conferences. He received the Best Paper Award at IEEE ICC 2016.
Organization: International Centre for Theoretical Physics
Abstract
For three decades statistical physics has provided a framework to analyse neural networks. A long-standing question has been whether it can tackle deep learning models that capture rich feature-learning effects, thus going beyond the narrow networks or kernel methods analysed until now. We answer positively through the study of the supervised learning of a multi-layer perceptron. Importantly, (i) its width scales as the input dimension, making it more prone to feature learning than ultra-wide networks, and more expressive than narrow ones or those with fixed embedding layers; and (ii) we focus on the challenging interpolation regime where the number of trainable parameters and the amount of data are comparable, which forces the model to adapt to the task. We consider the matched teacher-student setting. It provides the fundamental limits of learning random deep neural network targets and helps in identifying the sufficient statistics describing what is learnt by an optimally trained network as the data budget increases. A rich phenomenology emerges with various learning transitions. With enough data, optimal performance is attained through the model’s “specialisation” towards the target, but it can be hard to reach for training algorithms which get attracted by sub-optimal solutions. Specialisation occurs inhomogeneously across layers, propagating from shallow towards deep ones, but also across neurons within each layer. Furthermore, deeper targets are harder to learn. Despite its simplicity, the Bayesian-optimal setting provides insights on how depth, non-linearity and finite (proportional) width influence neural networks in the feature-learning regime that are potentially relevant well beyond it.
REF: Statistical physics of deep learning: Optimal learning of a multi-layer perceptron near interpolation (ArXiv 2510.24616)
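For concreteness, a schematic two-layer instance of the matched teacher-student setting mentioned above is sketched below; the talk and the reference treat the general multi-layer case, and the precise scalings, activation and priors written here are illustrative assumptions.

```latex
% Teacher network with width k proportional to the input dimension d,
% generating n labels, with n comparable to the number of trainable parameters:
\[
  y_\mu \;=\; \frac{1}{\sqrt{k}} \sum_{i=1}^{k} a_i\,
      \sigma\!\Big(\tfrac{1}{\sqrt{d}}\, w_i^{\top} x_\mu\Big) \;+\; \text{noise},
  \qquad \mu = 1,\dots,n ,
\]
% with teacher weights (a, W) drawn from a known prior. The student knows the
% architecture, the prior and the data; the Bayes-optimal error is attained by
% the posterior-mean predictor
\[
  \hat{f}_{\mathrm{Bayes}}(x) \;=\; \mathbb{E}\big[\, f_{a,W}(x) \,\big|\, \{x_\mu, y_\mu\}_{\mu \le n} \big],
\]
% studied in the proportional regime where k, d, n diverge together.
```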
Biography
Jean Barbier is Associate Professor at the International Centre for Theoretical Physics (ICTP) in Trieste, working on the mathematical physics of information processing systems, high-dimensional inference and the theory of neural networks. Thanks to a grant from the European Research Council, the CHORAL team is currently developing novel statistical tools to better quantify the performance of deep neural networks trained on structured data, through a combination of random matrix theory, statistical mechanics and information theory.
Organization: Bocconi University
Abstract
Temporal rescaling of sequential neural activity has been observed in multiple brain areas during behaviors involving motor execution at variable speeds. Temporally asymmetric Hebbian rules have been used in network models to learn and retrieve sequential activity, with characteristics that are qualitatively consistent with experimental observations. However, in these models sequential activity is retrieved at a fixed speed. Here, we investigate the effects of a heterogeneity of plasticity rules on network dynamics. In a model in which neurons differ by the degree of temporal symmetry of their plasticity rule, we find that retrieval speed can be controlled by varying external inputs to the network. Neurons with temporally symmetric plasticity rules act as brakes and tend to slow down the dynamics, while neurons with temporally asymmetric rules act as accelerators of the dynamics. We also find that such networks can naturally generate separate 'preparatory' and 'execution' activity patterns with appropriate external inputs.
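To make the notion of a per-neuron degree of temporal symmetry concrete, here is a minimal numerical sketch in the spirit of classical sequence-storage models; the mixing parameter lam, the sign dynamics and all sizes are illustrative assumptions, not the model analysed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 500, 10                          # neurons, patterns in the stored sequence
xi = rng.choice([-1, 1], size=(P, N))   # random binary patterns xi[mu, i]

# Per-neuron degree of temporal asymmetry: 0 = symmetric ("brake"),
# 1 = fully asymmetric ("accelerator").  Illustrative choice.
lam = rng.uniform(0.0, 1.0, size=N)

# Heterogeneous Hebbian couplings: the symmetric term stabilises each pattern,
# the asymmetric term pushes the state from pattern mu to pattern mu+1.
J = np.zeros((N, N))
for mu in range(P):
    nxt = (mu + 1) % P
    J += ((1 - lam)[:, None] * np.outer(xi[mu], xi[mu])
          + lam[:, None] * np.outer(xi[nxt], xi[mu]))
J /= N
np.fill_diagonal(J, 0.0)

# Parallel sign dynamics starting near the first pattern; the index of the
# pattern with the largest overlap shows the sequence being retrieved.
s = np.sign(xi[0] + 0.3 * rng.standard_normal(N))
for t in range(40):
    s = np.sign(J @ s)
    m = xi @ s / N                      # overlaps with the stored patterns
    print(t, int(np.argmax(m)), round(float(m.max()), 2))
```

Neurons with lam close to 0 contribute mostly symmetric, pattern-stabilising couplings, while those with lam close to 1 contribute asymmetric couplings that push the network along the stored sequence.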
Biography
Nicolas Brunel received his PhD in physics from the Pierre and Marie Curie University (Paris, France) in 1993. Thirty years later, his research articles have been cited over 18,000 times and he has given over 130 invited talks at international conferences and schools. He is a recipient of the Valentin Braitenberg Award for Computational Neuroscience, which left him feeling “incredibly honored, and frankly a little incredulous,” as he said when the news of the award reached him.
Organization: Great Bay University
Abstract
The interplay between uncertainty and nonlinearity in dynamical systems gives rise to rich and complex behaviors, including critical transitions between qualitatively distinct dynamical regimes. These transitions—often referred to as tipping phenomena—are pervasive in neural and other biological systems.
In this talk, we review recent advances in detecting critical transitions between metastable states by analyzing system trajectories, probability distributions, Onsager–Machlup actions, stochastic optimal transport, and dynamical Schrödinger bridges. Building on these developments, we introduce a unified framework of topological, probabilistic, and analytical indicators for anticipating transitions and revealing selection mechanisms in stochastic dynamical systems. Finally, we illustrate the effectiveness of these indicators through examples drawn from neural systems and other biophysical systems.
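For readers unfamiliar with the Onsager–Machlup action mentioned above, one common convention (prefactors differ across the literature) for a small-noise diffusion is the following:

```latex
% Onsager-Machlup action for dX_t = b(X_t) dt + sqrt(eps) dW_t over a path phi
% on [0, T]; most probable transition paths between metastable states are
% obtained by minimizing S_T over paths with fixed endpoints.
\[
  S_T[\varphi] \;=\; \int_0^T \Big[\, \frac{1}{2\varepsilon}\,
      \big|\dot\varphi(t) - b(\varphi(t))\big|^2
      \;+\; \frac{1}{2}\,\nabla\!\cdot b(\varphi(t)) \,\Big]\, dt .
\]
```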
Biography
Jinqiao Duan works in dynamical systems, especially stochastic dynamical systems and dynamical behaviours of stochastic partial differential equations, stochastic dynamics and data science, and applications in biophysical & geophysical systems. His books include `An Introduction to Stochastic Dynamics' (Cambridge University Press, 2015), and `Effective Dynamics of Stochastic Partial Differential Equations' (with Wei Wang, Elsevier, 2014).
Jinqiao Duan's particular contributions include a random invariant manifold framework, effective reduction and approximation of stochastic dynamics, quantifying non-Gaussian stochastic dynamics by nonlocal partial differential equations, a nonlocal Kramers-Moyal formula, non-Gaussian data assimilation, Onsager-Machlup action functionals, and the most probable transitions between metastable states for stochastic dynamical systems (especially under non-Gaussian Lévy fluctuations).
Organization: Massachusetts Institute of Technology
Abstract
Many optimization problems involving randomness exhibit a gap between the optimal values and the values achievable by known fast algorithms, dubbed the optimization-to-computation gap. For many such problems the best known algorithms are online: decisions are made sequentially, based on information about the problem input that is revealed step by step.
We will exhibit two problem examples where we prove the following: subject to such online restrictions, the known algorithms are the best possible. The examples are the largest-submatrix problem for a random matrix and the symmetric binary perceptron problem. The proof is based on establishing that these examples satisfy the Overlap Gap Property (OGP), a property deeply inspired by spin glasses. Previously, OGP was established as a barrier to stable algorithms. While the best algorithms for our two examples are not stable, they are online, and thus using OGP we obtain a tight characterization of the performance of online algorithms for these examples.
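Schematically, and for a maximization convention, the Overlap Gap Property referred to above says that near-optimal solutions cannot be "moderately close" to each other:

```latex
% A random objective H_n exhibits OGP at level kappa if there exist
% 0 <= nu_1 < nu_2 <= 1 such that, with high probability as n grows, no pair of
% near-optimal solutions has overlap in the forbidden middle range:
\[
  \sigma, \sigma' \in \{\, \tau : H_n(\tau) \ge \kappa \,\}
  \;\;\Longrightarrow\;\;
  \rho(\sigma, \sigma') \notin (\nu_1, \nu_2),
\]
% where rho is a normalized overlap (similarity) between solutions. For online
% algorithms, the argument is applied to the solutions such algorithms can output.
```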
Biography
David Gamarnik is the Nanyang Technological University Professor of Operations Research in the Operations Research and Statistics Group, Sloan School of Management, Massachusetts Institute of Technology. He received a B.A. in mathematics from New York University in 1993 and a Ph.D. in Operations Research from MIT in 1998. He was then a research staff member at the IBM T.J. Watson Research Center before joining MIT in 2005.
His research interests include discrete probability, optimization and algorithms, quantum computing, statistics and machine learning, stochastic processes and queueing theory. He is a fellow of the American Mathematical Society, the Institute of Mathematical Statistics and the Institute for Operations Research and the Management Sciences. He is a recipient of the Erlang Prize and the Best Publication Award from the Applied Probability Society of INFORMS, and was a finalist in the Franz Edelman Prize competition of INFORMS. He has co-authored a textbook on queueing theory. He currently serves as an area editor for the Mathematics of Operations Research journal. In the past he has served as an area editor of the Operations Research journal, and as an associate editor of Mathematics of Operations Research, the Annals of Applied Probability, Queueing Systems and the Stochastic Systems journals.
Organization: Emeritus Professor
Abstract
Learning in deep networks is notoriously slow. Perhaps understanding why would help us to make networks that learn faster. In this work we have investigated learning dynamics quantitatively and found that it shares key features with that of mean-field spin glass models. We have studied multilayer models with recurrent interactions within layers, trained using stochastic gradient descent. We have focused on the region in parameter space near the transition to perfect learning of the training data. In a network of given depth, this transition occurs at a critical value of the layer size (“network width”) w. Approaching the transition from the overparametrized side, we find critical slowing down: the learning time required to reach a given (small) error level delta varies like 1/(w-w_c(delta)). This behaviour in the w-delta plane is like that near the AT line in the SK or p-spin models in a field. We find a power-law approach to the minimum error value at long times in a wide critical region both above and below w_c(0). In this region, the long-time dynamics exhibit aging like that in p-spin models: In a system that has been learning for a time t_w, the fluctuations of the weight values at a later time t_w + t are well-described by a single function of t/t_w. This function has a power-law form (in agreement with p-spin glass theory) at small t/t_w and crosses over to slower growth at very large t/t_w. The very large t/t_w data are consistent with the logarithmic growth found by Cugliandolo and Le Doussal for a model of a particle diffusing in a correlated random potential in infinite dimensions. We conjecture that that model may capture the essential dynamical features of learning in large networks.
Work done with Joanna Tyrcha, Stockholm University.
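In compact form, the two scaling statements of the abstract are as follows, with w the layer width, delta the target training error, t_w the elapsed learning time, and C standing for the weight-fluctuation measure described above:

```latex
% Critical slowing down on the overparametrized side of the transition:
\[
  \tau(\delta) \;\sim\; \frac{1}{\,w - w_c(\delta)\,},
\]
% and aging in the critical region: weight fluctuations between times t_w and
% t_w + t collapse onto a single scaling function,
\[
  C(t_w + t,\, t_w) \;\approx\; \mathcal{F}\!\left(\frac{t}{t_w}\right),
\]
% with F(x) a power law at small x, crossing over to slower (logarithmic-like)
% growth at very large x, as described in the abstract.
```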
Biography
John Hertz is an emeritus professor at the Nordic Institute of Theoretical Physics (NORDITA), an institute hosted by Stockholm University and the KTH Royal Institute of Technology, Sweden, as well as at the Niels Bohr Institute at the University of Copenhagen, Denmark. He received his Ph.D. from the University of Pennsylvania and worked at the University of Cambridge, UK, and the University of Chicago before moving to NORDITA in 1980. He has worked on problems in condensed matter and statistical physics, notably itinerant-electron magnetism, spin glasses, and artificial neural networks. In recent decades he has worked primarily in theoretical neuroscience, focusing particularly on cortical circuit dynamics and network inference.
Organization: École Polytechnique Fédérale de Lausanne
Abstract
In this talk, I will revisit the classical committee machine through the lens of modern high-dimensional asymptotics and over-parameterized learning. I will present recent results on empirical risk minimization in wide two-layer networks trained on synthetic Gaussian data, focusing on the interplay between spectral structure, inductive bias, and scaling laws. I will first show how quadratic two-layer networks admit sharp asymptotic formulas and an implicit bias toward sparse multi-index recovery, equivalent to a convex matrix sensing problem with an effective nuclear-norm penalty. I will then present recent progress on scaling laws and spectral properties in the feature-learning regime, where connections to matrix compressed sensing and LASSO yield a phase diagram of excess-risk exponents and link distinct scaling regimes to characteristic weight-spectrum signatures, including power-law tails. Finally, I will briefly explain how deep architectures trained on hierarchical Gaussian targets exploit compositional structure to achieve improved sample complexity. Together, these results outline a unified asymptotic framework connecting solutions, spectra, and scaling laws in modern over-parameterized networks.
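As a schematic rendering of the equivalence stated above for quadratic two-layer networks (the exact losses, scalings and the value of the effective penalty are part of the talk's setting and are not reproduced here):

```latex
% A quadratic two-layer network is a matrix sensing model in disguise:
\[
  f(x) \;=\; \frac{1}{\sqrt{p}} \sum_{k=1}^{p} a_k\, (w_k^{\top} x)^2
        \;=\; x^{\top} S\, x,
  \qquad S \;=\; \frac{1}{\sqrt{p}} \sum_{k=1}^{p} a_k\, w_k w_k^{\top},
\]
% and, in the asymptotics discussed, empirical risk minimization over (a, W)
% behaves like the convex program
\[
  \min_{S} \;\sum_{\mu=1}^{n} \ell\big(y_\mu,\; x_\mu^{\top} S\, x_\mu\big)
        \;+\; \lambda_{\mathrm{eff}}\, \|S\|_{*},
\]
% i.e. matrix sensing with an effective nuclear-norm penalty, which encodes the
% implicit bias toward sparse (low-rank) multi-index recovery.
```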
Biography
Florent Krzakala is a full professor at École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland. His research interests include Statistical Physics, Machine Learning, Statistics, Signal Processing, Computer Science and Computational Optics. He leads the IdePHIcs (Information, Learning and Physics) laboratory in the Physics and Electrical Engineering departments at EPFL. He is also the founder and scientific advisor of the startup LightOn.
Organization: Bocconi University
Abstract
This talk will describe the present status of theoretical studies on generative diffusion based on statistical physics, focusing on the important question of memorization versus generalization.
Biography
After starting his research career at the CNRS (French National Centre for Scientific Research) in 1981, Marc Mézard became a Research Director in 1990. He initially worked at the Laboratory of Statistical Physics of the École normale supérieure before joining the Laboratory of Theoretical Physics and Statistical Models (LPTMS) at Paris-Sud University in 2001.
He taught at the École Polytechnique from 1987 to 2012. In 2012, he was appointed Director of the École normale supérieure in Paris, a position he held for ten years. Since 2022, he has been a professor in the Department of Computational Sciences at Bocconi University in Milan and a member of several international scientific bodies.
Marc Mézard is a graduate of the École normale supérieure (class of 1976) and holds a State Doctorate in statistical physics. A specialist in disordered systems, he has published over 170 scientific articles and co-authored two reference books, Spin Glass Theory and Beyond (1987) and Information, Physics and Computation (2009). He has received numerous awards, including the CNRS Silver Medal, the Ampère Prize, the Humboldt-Gay-Lussac Prize, the Lars Onsager Prize, and the Three Physicists Prize.
Organization: École Normale Supérieure
Abstract
Despite the unsustainable growth in energy consumption by artificial intelligence models and the recognition of the major role played by metabolic constraints in brain evolution, the relationship between computation and energy remains insufficiently studied and understood. Recently, Padamsey et al. experimentally investigated this relationship in the context of visual information processing in food-deprived mice. Combining an analysis of their activity data with modeling inspired by statistical physics, I will propose a mechanism by which neural circuits can spare considerable energy with little impact on their performance.
Biography
Remi Monasson is a theoretical physicist working on the theory, development and applications of statistical physics methods to computational problems in inference, machine learning and computational neuroscience. He has authored 126 publications (h-index 41, 6550 citations). He was awarded the Bronze Medal of the CNRS in 1997, the Lecomte Prize of the Académie des Sciences in 2004, and the Langevin Prize of the French Physical Society in 2010; he was a senior member at the Center for Systems Biology, Institute for Advanced Study, Princeton, from 2009 to 2011.
Organization: Sapienza University of Rome
Biography
Giorgio Parisi graduated from Rome University in 1970 under the supervision of Nicola Cabibbo. He worked as a researcher at the Laboratori Nazionali di Frascati from 1971 to 1981. During this period he was on leave of absence from Frascati at Columbia University, New York (1973-1974), at the Institut des Hautes Études Scientifiques (1976-1977) and at the École Normale Supérieure, Paris (1977-1978). He became full professor at Rome University in 1981; from 1981 to 1992 he was full professor of Theoretical Physics at the University of Roma II, Tor Vergata, and he is now professor of Quantum Theories at the University of Rome I, La Sapienza.
He received the Feltrinelli Prize for physics from the Accademia dei Lincei in 1986, the Boltzmann Medal in 1992, the Italgas Prize in 1993, and the Dirac Medal and Prize in 1999. In 1987 he became corresponding fellow of the Accademia dei Lincei and full fellow in 1992; he has also been a fellow of the French Academy of Sciences since 1993. He gave the Loeb Lectures at Harvard University in 1986, the Fermi Lectures at the Scuola Normale (Pisa) in 1987, and the Celsius Lectures at Uppsala University in 1993. He is (or has been) a member of the editorial boards of various journals (Nuclear Physics [Field Theory and Statistical Mechanics], Communications in Mathematical Physics, Journal of Statistical Mechanics, Europhysics Letters, International Journal of Physics, Il Nuovo Cimento, Networks, Journal de Physique, Physica A, Physical Review E) and of the scientific committees of the Institut des Hautes Études Scientifiques, the École Normale Supérieure (Physique), the Scuola Normale (Pisa), the Human Frontier Science Program Organization, the INFM and the French National Research Panel, and has served as head of the Italian delegation at IUPAP.
Giorgio Parisi has written about 350 scientific publications in journals and about 50 contributions to conference or school proceedings. His main activity has been in the fields of elementary particles, the theory of phase transitions and statistical mechanics, mathematical physics and string theory, disordered systems (spin glasses and complex systems), neural networks, theoretical immunology, computers and very-large-scale simulations of QCD (the APE project), and non-equilibrium statistical physics. He has also written three books: Statistical Field Theory (Addison-Wesley, New York, 1988); Spin Glass Theory and Beyond (World Scientific, Singapore, 1988), in collaboration with M. Mézard and M.A. Virasoro; and Field Theory, Disorder and Simulations (World Scientific, Singapore, 1992).
Organization: Complutense University of Madrid
Abstract
Energy-based generative models offer a natural and powerful framework for modelling complex data, as they provide direct access to the explicit probability distribution over all variables of interest, enabling rich and interpretable analyses of the learned features. Despite their long history, however, these models remain underused because they are notoriously difficult to train and to sample from efficiently. In this talk, I will show how training instabilities originate from inaccurate Monte Carlo estimates of the gradient, and how a physical analysis of the evolving free energy landscape along the training trajectory reveals the underlying difficulty: the progressive crossing of multiple second-order phase transitions [1]. Building on these insights, I will present optimized algorithms that enable fast and accurate training [2,3], opening the way for the systematic use of these models as a new generation of powerful inference tools.
[1] D Bachtis, G Biroli, A Decelle, B Seoane, NeurIPS (2024).
[2] N Béreux, A Decelle, C Furtlehner, L Rosset, B Seoane, ICLR (2025).
[3] N Béreux, A Decelle, C Furtlehner, B Seoane, EurIPS PriGM workshop (2025).
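The training instabilities discussed in the abstract enter through the Monte Carlo estimate of the model ("negative-phase") moments in the likelihood gradient. Below is a minimal, fully visible toy sketch of that estimate using persistent Gibbs chains; it is illustrative only, is not the models or the optimized algorithms of [1-3], and all sizes and hyperparameters are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, n_chains = 20, 500, 50           # spins, data points, persistent chains

# Toy +/-1 data (in practice: the training set of interest).
data = rng.choice([-1.0, 1.0], size=(M, N))

W = np.zeros((N, N))                   # symmetric couplings, zero diagonal
b = np.zeros(N)                        # local fields
chains = rng.choice([-1.0, 1.0], size=(n_chains, N))  # persistent MCMC chains

def gibbs_sweep(s, W, b, beta=1.0):
    """One sweep of single-site Gibbs updates on +/-1 spins."""
    for i in range(s.shape[1]):
        h = s @ W[:, i] + b[i]
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))
        s[:, i] = np.where(rng.random(s.shape[0]) < p_up, 1.0, -1.0)
    return s

lr, k_steps = 0.01, 10
for epoch in range(100):
    # Positive phase: exact moments under the data.
    pos_W = data.T @ data / M
    pos_b = data.mean(axis=0)
    # Negative phase: model moments *estimated* by k Gibbs sweeps of the chains.
    for _ in range(k_steps):
        chains = gibbs_sweep(chains, W, b)
    neg_W = chains.T @ chains / n_chains
    neg_b = chains.mean(axis=0)
    # Likelihood gradient = data moments - model moments; the negative phase
    # is the Monte Carlo estimate whose inaccuracy drives the instabilities.
    W += lr * (pos_W - neg_W)
    np.fill_diagonal(W, 0.0)
    b += lr * (pos_b - neg_b)
```

When the persistent chains fail to equilibrate across an emerging multi-phase free-energy landscape, the negative phase becomes biased; this is the mechanism behind the instabilities described above.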
Biography
Beatriz Seoane is a theoretical physicist working at the intersection of statistical and computational physics, with a particular focus on disordered systems such as spin and structural glasses, proteins, and neural networks. She has developed optimized Monte Carlo algorithms and exploited custom computing architectures to address the notoriously slow dynamics of these systems. In recent years, her research has expanded toward bioinformatics and machine learning, where she applies concepts and tools from spin-glass physics to energy-based generative models. This work has revealed fundamental phase transitions in their learning dynamics, inspired more efficient training protocols, and enabled interpretable, data-driven inference methods.
She is the author of 54 scientific publications and has been invited to speak at over 30 international conferences and schools. Following postdoctoral appointments at leading European institutions—including a postdoc under the supervision of Nobel laureate Giorgio Parisi—she began her independent research career in 2020 as a junior PI at the Complutense University of Madrid. She subsequently held the Junior Chair in Physics for Machine Learning at Paris-Saclay University before returning to Madrid as a permanent faculty member in the Department of Theoretical Physics, where she now leads projects bridging statistical physics and machine learning.
Organization: Harvard University
Abstract
The journey from spin glasses to neural networks began more than forty years ago with the study of associative memory in shallow recurrent networks. Despite decades of progress, we still lack a satisfactory theoretical model for the continual storage of correlated, semantically rich memories in neural systems. I will describe a recently developed framework for continual storage of episodic memories, each a short paragraph of text, by fine-tuning the parameters of a pretrained LLM, adopting an architecture that mimics the hippocampus–cortex dual-memory system. I will present a spin-glass theory for the system’s capacity, discuss the role of replica symmetry breaking, and elucidate the connection between this long-term memory model and neural networks with attractor manifolds.
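As background, the "associative memory in shallow recurrent networks" referred to in the opening sentence is the classical Hopfield setting, whose spin-glass (replica) analysis gives the storage capacity that the framework above generalizes to correlated, semantically rich memories:

```latex
% Hebbian couplings storing P random binary patterns xi^mu in N neurons:
\[
  J_{ij} \;=\; \frac{1}{N} \sum_{\mu=1}^{P} \xi_i^{\mu} \xi_j^{\mu},
  \qquad i \ne j ,
\]
% for which replica theory (Amit, Gutfreund, Sompolinsky) predicts reliable
% retrieval only up to a critical load
\[
  \alpha \;=\; \frac{P}{N} \;\le\; \alpha_c \;\approx\; 0.138 .
\]
```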
Biography
Haim Sompolinsky earned his PhD in Physics from Bar-Ilan University, Israel. Currently, he holds positions as Professor of Physics and Neuroscience (Emeritus) at Hebrew University, Israel, and as Professor of Molecular and Cellular Biology and of Physics (in Residence) at Harvard University, USA.
The laboratory led by Haim Sompolinsky employs statistical physics methods to investigate the emergent dynamics and collective behavior of complex neuronal circuits and their relationship to critical brain functions, including learning, memory, perception, and cognition. His theoretical predictions have received experimental support from the study of navigational circuits in flies and rodents. His work has elucidated how the dynamic balance between neuronal excitation and inhibition leads to chaotic yet stable patterns of brain activity. This has influenced our understanding of the origins of variability in neuronal activity, the mechanisms underpinning the stability of neuronal dynamics, and the impact of the disruption of excitation-inhibition balance in neurological diseases.
More recently, Sompolinsky has developed geometric methods that provide a principled approach to the study of information processing in vision and language, in both artificial neural networks and brain circuits. This work has revealed surprising similarities between the two systems and opens a new path for synergetic investigations of intelligence in natural and artificial systems.
Organization: University of Toronto
Abstract
Real-world data often conceal meaningful signals beneath both random and structured noise. Structured noise arises in many forms, from batch effects in biomedical studies to background in image classification. Surprisingly, algorithms that encourage diversity or uniformity in their learned representations often generalize better out of context. To understand this phenomenon, we study linear representation learning with two views, comparing classical and contrastive methods, with and without a uniformity constraint. The classical non-contrastive algorithms break down under structured noise. Contrastive learning with an alignment-only loss performs well when background variation is mild but fails under strong structured noise. In contrast, contrastive learning that enforces a uniformity constraint remains robust regardless of noise strength. Empirical results confirm these insights. Taking one step further, we discuss how to make algorithms that are robust to random noise and to nonstationary dynamics.
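For readers who want a concrete handle on the alignment and uniformity terms contrasted above, here is a small numerical sketch in the form popularized by Wang and Isola; the two-view toy data, the random linear encoder and the constant t are illustrative assumptions, not the talk's theoretical setting.

```python
import numpy as np

def alignment_loss(z1, z2):
    """Mean squared distance between embeddings of the two views (alignment)."""
    return np.mean(np.sum((z1 - z2) ** 2, axis=1))

def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs: low when the
    unit-normalized embeddings spread uniformly on the sphere (uniformity)."""
    sq_dists = 2.0 - 2.0 * (z @ z.T)    # valid because rows of z are unit norm
    iu = np.triu_indices(len(z), k=1)
    return np.log(np.mean(np.exp(-t * sq_dists[iu])))

# Toy two-view data: shared signal plus independent noise on each view.
rng = np.random.default_rng(0)
n, d, k = 1000, 50, 5
signal = rng.standard_normal((n, k)) @ rng.standard_normal((k, d))
view1 = signal + 0.5 * rng.standard_normal((n, d))
view2 = signal + 0.5 * rng.standard_normal((n, d))

# A linear encoder (here random, for illustration) followed by normalization.
E = rng.standard_normal((d, k))
z1 = view1 @ E; z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = view2 @ E; z2 /= np.linalg.norm(z2, axis=1, keepdims=True)

print("alignment :", alignment_loss(z1, z2))
print("uniformity:", uniformity_loss(np.vstack([z1, z2])))
```

In the linear two-view analysis of the talk, it is the presence or absence of the uniformity constraint that determines robustness to structured noise.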
Biography
Qiang Sun is currently an Associate Professor of Statistics and Computer Science at the University of Toronto (UofT), leading the StatsLE (Statistics, Learning, and Engineering) Lab. He is interested in statistics + AI, with a focus on efficient generative AI (GenAI), trustworthy AI, and the foundations of next-generation statistics. Motivated by challenges in the industrial sector, his interests extend to ensemble learning, transfer learning, and reinforcement learning. He is also interested in AI for tech, finance, and science. In addition to his faculty role, he serves as an associate editor for the Electronic Journal of Statistics (EJS) and as an area chair for ML conferences such as AISTATS, COLT, and UAI.
Prior to his tenure at UofT, he was an associate research scholar at Princeton University. He obtained his PhD from the University of North Carolina at Chapel Hill (UNC-CH) and his BS in SCGY from the University of Science and Technology of China (USTC).
Organization: Sapienza University of Rome
Abstract
The task of efficiently sampling the Gibbs-Boltzmann distribution of disordered systems is important both for the theoretical understanding of these models and for the solution of practical optimization problems. Unfortunately, this task is known to be hard, especially for spin glasses at low temperatures. Recently, many attempts have been made to tackle the problem by combining classical Monte Carlo schemes with newly devised neural networks that learn to propose smart moves. In this talk I will review a few physically interpretable deep architectures, and in particular one whose number of parameters scales linearly with the size of the system and that can be applied to a large variety of topologies. I will show that these architectures can accurately learn the Gibbs-Boltzmann distribution for the two-dimensional and three-dimensional Edwards-Anderson models, and specifically for some of their most difficult instances. I will show that the performance increases with the number of layers, in a way that clearly connects to the correlation length of the system, thus providing a simple and interpretable criterion for choosing the optimal depth. Finally, I will discuss the performance of these architectures in proposing smart Monte Carlo moves and compare them to state-of-the-art algorithms. I will present clear and robust evidence that a machine-learning-assisted optimization method can exceed the capabilities of classical state-of-the-art techniques in a combinatorial optimization setting.
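To fix ideas about machine-learned "smart" Monte Carlo moves, here is a minimal Metropolis-Hastings sketch in which a learned model proposes global configurations; the independent-spin stand-in for the learned proposal, the toy 2D Edwards-Anderson instance and all parameters are illustrative assumptions, not the architectures of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 8
N = L * L
beta = 1.0

# Target: 2D Edwards-Anderson model with Gaussian couplings, periodic lattice.
Jh = rng.standard_normal((L, L))       # horizontal bonds
Jv = rng.standard_normal((L, L))       # vertical bonds

def energy(s_flat):
    s = s_flat.reshape(L, L)
    return (-np.sum(Jh * s * np.roll(s, -1, axis=1))
            - np.sum(Jv * s * np.roll(s, -1, axis=0)))

# Stand-in "learned" proposal: independent spins with marginals p_up.
# (In the talk this role is played by a deep, physically interpretable network.)
p_up = np.full(N, 0.5)

def sample_q():
    return np.where(rng.random(N) < p_up, 1.0, -1.0)

def log_q(s):
    return np.sum(np.where(s > 0, np.log(p_up), np.log(1.0 - p_up)))

# Machine-learning-assisted Metropolis-Hastings: global moves proposed by the
# model, accepted so that detailed balance w.r.t. the Gibbs-Boltzmann measure holds.
s = sample_q()
accepted = 0
for step in range(10000):
    s_new = sample_q()
    log_acc = -beta * (energy(s_new) - energy(s)) + log_q(s) - log_q(s_new)
    if np.log(rng.random()) < log_acc:
        s, accepted = s_new, accepted + 1
print("acceptance rate:", accepted / 10000)
```

The acceptance rule guarantees detailed balance with respect to the Gibbs-Boltzmann distribution whatever the quality of the proposal; the better the learned proposal approximates the target, the higher the acceptance rate and the faster the sampling.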
Biography
Francesco Zamponi is a theoretical physicist whose research focuses on the statistical mechanics of disordered and complex systems. His work spans the physics of glasses and jamming, spin glasses, and random constraint satisfaction problems, as well as the interplay between disorder, frustration, and emergent collective behavior. More recently, he has applied concepts and methods from disordered systems to biological contexts, in particular to the statistical physics of fitness landscapes and evolution, developing models that connect sequence variability, epistasis, and evolutionary dynamics.
Organization: Bocconi University
Abstract
We show that asymmetric deep recurrent neural networks, enhanced with additional sparse excitatory couplings, give rise to an exponentially large, dense accessible manifold of internal representations which can be found by different algorithms, including simple iterative dynamics.
Building on the geometrical properties of the stable configurations, we propose a distributed learning scheme in which input-output associations emerge naturally from the recurrent dynamics, without any need for gradient evaluation.
A critical feature enabling the learning process is the stability of the configurations reached at convergence, even after removal of the supervisory output signal. Extensive simulations demonstrate that this approach performs competitively on standard AI benchmarks. The model can be generalized in multiple directions, both computational and biological.
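As a rough illustration of the kind of iterative dynamics mentioned above, the following sketch iterates an asymmetric random network with an additional sparse excitatory component and checks whether it settles into a stable configuration; the coupling statistics, strengths and update rule are illustrative assumptions, not the model of the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
c_exc = 5                    # average number of sparse excitatory couplings per neuron
g_asym, g_exc = 1.0, 2.0     # illustrative coupling strengths

# Asymmetric random couplings plus a sparse non-negative (excitatory) component.
J_asym = g_asym * rng.standard_normal((N, N)) / np.sqrt(N)
mask = rng.random((N, N)) < c_exc / N
J_exc = g_exc * mask * rng.random((N, N)) / np.sqrt(c_exc)
J = J_asym + J_exc
np.fill_diagonal(J, 0.0)

# Simple parallel iterative dynamics from a random initial condition: a fixed
# point, if reached, is a stable internal representation of the kind discussed.
s = rng.choice([-1.0, 1.0], size=N)
for t in range(200):
    s_new = np.sign(J @ s)
    s_new[s_new == 0] = 1.0
    if np.array_equal(s_new, s):
        print(f"fixed point reached after {t} parallel updates")
        break
    s = s_new
else:
    print("no fixed point within 200 updates (parameters are illustrative)")
```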
Biography
Riccardo Zecchina is a Professor of Theoretical Physics at Bocconi University in Milan, where he holds a Chair in Machine Learning. His current research interests lie at the intersection of statistical physics, computer science, and artificial intelligence. He obtained his PhD in Theoretical Physics from the University of Turin, working under the supervision of Tullio Regge. He then served as a researcher and head of the Statistical Physics group at the International Centre for Theoretical Physics in Trieste (1997–2007), and subsequently as a Full Professor of Theoretical Physics at the Polytechnic University of Turin (2007–2017). In 2017, he moved to Bocconi University in Milan, establishing the Department of Computer Science and creating degree programs in mathematical and computational methods for Artificial Intelligence. He has been a long-term visiting scientist multiple times at Microsoft Research (in Redmond and Cambridge, MA) and at the Laboratory of Theoretical Physics and Statistical Models (LPTMS) of the University of Paris-Sud. In 2016, he was awarded the Lars Onsager Prize in Theoretical Statistical Physics by the American Physical Society, together with M. Mézard and G. Parisi. Previously, he received an ERC Advanced Grant from the European Research Council (2011–2015).