R0-FoMo: Robustness of Few-shot and Zero-shot Learning in Foundation Models

Dec 15, NeurIPS 2023

Ballroom A/B, New Orleans Convention Center


Overview

Recent advances in the capabilities of large foundation models have been catalyzed by repurposing pretrained models for domain-specific use cases through few-shot learning methods such as prompt tuning and in-context learning, as well as zero-shot learning from task descriptions. Given a few labeled examples that outline a new task [T5, GPT2, T0, DALL-E, CLIP], these large foundation models have demonstrably improved upon previous few-shot learning benchmarks [T-few, LAION]. We are closer than ever to learning from very few examples, and recent works [Frozen, Flamingo] have proposed methods to apply large language and vision transformer models directly to these few examples, instead of relying on human annotation to create large datasets for fine-tuning. The lessons learned from past work in counterfactual reasoning, domain adaptation, meta-learning, continual learning, and adversarial training must be revisited with a new lens toward improving the robustness of few-shot learning methods, and of methods that learn with no supervision (i.e., from unlabeled data), so that they scale to multiple tasks in a safe and responsible manner.

In addition to leveraging few-shot learning methods with labeled examples, there is also significant potential in harnessing the power of unlabeled data. When labeled and unlabeled data come from the same distribution, semi-supervised learning methods can be adapted to utilize large foundation models, further boosting performance over purely few-shot algorithms. Similar ideas need to be explored for unsupervised domain adaptation, to improve the robustness of fine-tuned models to distribution shifts when the unlabeled data distribution is much broader than the distribution from which the labeled examples are collected. As these few-shot methods come closer to making a large impact across multiple domains, we want to ask a few important questions:

Evaluating the robustness of few-shot and pre-trained models: What are the current patterns of failure when few-shot learning models are deployed? How do we reliably measure coverage of robustness to emergent failure patterns? How can we build automated tools for evaluating robustness that correlate with real-world use of the models? What distributional blind spots do these few-shot learning models have? What are the pitfalls of existing robustness metrics?

Challenges of building Responsible AI using few-shot methods: What are some of the harms perpetuated by few-shot learning methods? How can we anticipate the robustness and safety issues that will arise in the future? How do we build guardrails that prevent severe harms from being perpetuated (e.g., hate speech, pornography, xenophobia, racism)?

Novel methods to improve few-shot robustness: How can we apply domain adaptation methods to improve robustness in few-shot learning? What is the relationship between the number of few-shot examples and robustness? What are the pitfalls of existing mitigation approaches, such as data augmentation and adversarial training, and how can they be repurposed?

Reimagining human-in-the-loop: What tools can we build to assist humans in writing robust prompts or few-shot examples? How can we communicate the uncertainty of these few-shot learning models through reasoning? How do we expand and assist human evaluation methods through auxiliary generative models?

Improving few-shot transfer with unlabeled data: Can we leverage unlabeled data to improve the zero-shot or few-shot transfer of large-scale models (e.g., GPT-3, CLIP)? Are existing domain adaptation and semi-supervised learning methods still applicable in the era of large-scale pretrained models? (A minimal sketch of one such approach is given after this list.)
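To make the last topic concrete, the snippet below is a minimal, illustrative sketch of one way unlabeled data could be combined with a pretrained model: pseudo-label the unlabeled examples with a zero-shot classifier, keep only the confident predictions, and fine-tune on them together with the few labeled examples. The names (backbone, pseudo_label) and the confidence threshold are illustrative placeholders, not a method proposed by the workshop; the linear layer simply stands in for a large pretrained model such as CLIP.

import torch
import torch.nn.functional as F

def pseudo_label(backbone, unlabeled_x, threshold=0.9):
    # Keep only unlabeled examples that the pretrained model classifies confidently.
    with torch.no_grad():
        probs = F.softmax(backbone(unlabeled_x), dim=-1)
    confidence, labels = probs.max(dim=-1)
    keep = confidence >= threshold
    return unlabeled_x[keep], labels[keep]

# Stand-in for a pretrained zero-shot classifier (e.g., a frozen CLIP encoder
# with text-embedding class prototypes); here just a linear layer over 10 classes.
backbone = torch.nn.Linear(512, 10)

few_shot_x = torch.randn(16, 512)        # a handful of labeled examples
few_shot_y = torch.randint(0, 10, (16,))
unlabeled_x = torch.randn(1024, 512)     # a much larger pool of unlabeled data

pl_x, pl_y = pseudo_label(backbone, unlabeled_x)
train_x = torch.cat([few_shot_x, pl_x])
train_y = torch.cat([few_shot_y, pl_y])

# One fine-tuning step on the combined labeled + pseudo-labeled set.
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
loss = F.cross_entropy(backbone(train_x), train_y)
loss.backward()
optimizer.step()

Whether such pseudo-labeling helps or hurts under distribution shift, and how the confidence threshold should be chosen, are exactly the kinds of robustness questions the workshop aims to discuss.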

The goal of this workshop is to bring together machine learning researchers from academia and industry to encourage knowledge transfer and collaboration on these topics, and to surface ideas that expand our understanding of the robustness of few-shot learning approaches built on large foundation models. The ideal outcome of the workshop is a set of concrete research directions that enable the next generation of robust, safe, and responsible models.


Please contact r0-fomo[ at ]googlegroups[ dot ]com for inquiries.

Invited Speakers/Panelists 

(in alphabetical order by last name)

Anima Anandkumar

Caltech & NVIDIA

Yair Carmon

Tel Aviv University

Srijan Kumar

Georgia Tech

Tian Li

University of Chicago

Partha Talukdar

Google Research

Organizers

Yao Qin

UC Santa Barbara

Reviewing


If you are interested in serving as a reviewer for our workshop, please sign up using this form.