Schedule & Info
The ART of Safety Workshop will be fully virtual on Nov 1st, 2023.
Check back later for links and attendance details.
Tentative schedule
To accommodate the time zones of our participants and speakers, all times are listed in Eastern Daylight Time (UTC−04:00). Sessions will be recorded for those who cannot attend synchronously.
9:00-9:15 am Welcome
9:15-10:00 am Keynote 1: For large-scale testing, a community is more than a crowd, D. Sculley, Kaggle
10:00-10:15 am Spotlight talk: Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge
Authors: Manuel Brack, Patrick Schramowski and Kristian Kersting
10:15-10:30 am Spotlight talk: Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks
Authors: Aleksander Buszydlik, Karol Dobiczek, Michał Teodor Okon, Konrad Skublicki, Philip Lippmann and Jie Yang
10:30-10:45 am Spotlight talk: Student-Teacher Prompting for Red Teaming to Improve Guardrails
Authors: Rodrigo Revilla Llaca, Victoria Leskoschek, Vitor Costa Paiva, Catalin Lupău, Philip Lippmann and Jie Yang
10:45-11:00 am Break
11:00-11:15 am Spotlight talk: Measuring Adversarial Datasets
Authors: Yuanchen Bai, Raoyi Huang, Vijay Viswanathan, Tzu-Sheng Kuo and Tongshuang Wu
11:15-11:30 am Spotlight talk: Uncovering Bias in AI-Generated Images
Author: Kimberly Baxter
11:45 am-12:15 pm Findings from the Adversarial Nibbler Challenge
12:15-1:00 pm Lunch
1:00-1:45 pm Keynote 2: Alex Beutel, OpenAI
1:45-2:30 pm Keynote 3: Mind the Gaps: Adversarial Testing in Generative AI, Challenges and Opportunities, Hamid Palangi, Microsoft
2:30-2:45 pm Spotlight talk: Discovering Safety Issues in Text-to-Image Models: Insights from the Adversarial Nibbler Challenge
Author: Gauri Sharma
2:45-4:00 pm Small and large group discussion
4:00-4:05 pm Closing remarks