Speaker: Hadi Daneshmand
Date: Feb 7, 2025
Time: 12:15 - 1:45pm
Location: Rice Hall 109
Abstract:
Deep neural networks are often seen as statistical parametric models, but this perspective falls short of fully explaining their capabilities, such as generalization and “in-context learning”. I will talk about an emerging viewpoint: deep neural networks can inherently optimize various functions. Linking transformers to optimization methods, I will show that language models can provably sort and, more generally, solve optimal transport, with an approximation error that vanishes as depth increases. This computational perspective also provides insights into the effectiveness of prompt engineering and the mechanisms of generalization. As background, the talk will include an overview of the optimal transport problem.
Bio:
Hadi joined UVA as an Assistant Professor of Computer Science in December 2024. Previously, he was a FODSI postdoctoral researcher at MIT and Boston University. He earned his PhD in Computer Science from ETH Zurich in 2020. His research focuses on the foundations of machine learning, bridging continuous optimization and deep learning theory. His contributions have been recognized with a Stanford CPAL Rising Star Award, an SNSF Mobility Grant, and a Spotlight Award at the ICML Workshop on In-Context Learning.
Zoom: https://virginia.zoom.us/j/94117879510?pwd=EfbkoNIrAAwZC80K8D06W0fA4Vhzes.1
Meeting ID: 941 1787 9510
Passcode: 893368