Searching for architectures and BERT moments in specialized AI applications
Thursday, May 22, 2025, 2:00PM
Richard Hawryluk Conference Room, B318
Host: Kyle Parfrey
Abstract: In 2018, advances in architecture design and self-supervised learning led to the “BERT moment” in natural language processing, in which supervised learning workflows were permanently supplanted by the pretraining and fine-tuning of massive Transformer models. This spurred scientists in more specialized areas—e.g., genomics, satellite imaging, and time series forecasting—to develop “foundation models” (FMs) of their own. In a broad investigation of over thirty such models on over fifty tasks across three distinct modalities, we find that these specialized FMs still struggle to beat (often much cheaper) supervised learning pipelines. This indicates that the benefits of large-scale pretraining have yet to be fully realized in these domains and that better evaluations need to be developed to drive and measure progress. The broad scope of our study is enabled by new methods extending neural architecture search—a technique previously used mainly for vision tasks—to applications in the natural sciences and engineering.
Sponsorship of an event does not constitute institutional endorsement of external speakers or views presented.