How to submit
We'll accept both ARR-reviewed papers and direct workshop submissions. Please check back for more info and the submission link closer to the deadline.
Workshop Paper Tracks
We invite two types of original contributions, empirical papers and position papers, both of which may be submitted to either of two workshop tracks:
Challenge Track: Papers from participants in the Adversarial Nibbler Challenge describing their experiences and red-teaming approaches
Workshop Topics Track: Empirical and position papers related to workshop topics
All submissions will be peer reviewed by the workshop Program Committee.
ARR submission
If you would like to submit a paper that already has reviews through the ACL Rolling Review process, please email artofsafetyworkshop@gmail.com with the following information: (i) which track you are submitting to; (ii) the unique URL for your paper with reviews (https://openreview.net/forum?id=XXXXXXXXXXX). To use this submission option, all reviews must be complete before the submission deadline.
Track 1: Adversarial Challenge Track
Participants in the Adversarial Nibbler challenge can submit their results, reflections, and insights to this track. We invite papers of 2-4 pages in length.
The workshop organizers are hosting an adversarial data challenge called Adversarial Nibbler, a data-centric AI hackathon for discovering a diverse set of safety vulnerabilities (i.e., adversarial examples) in current state-of-the-art Text-to-Image (T2I) models, with the ultimate goal of improving their safety. A typical bottleneck in safety evaluation is achieving wide coverage of different types of challenging examples in the evaluation set, i.e., identifying "unknown unknowns" or long-tail problems. All datasets collected during the challenge will be made publicly available under a CC-BY-SA license. With this release, we aim to facilitate model training, optimization, and safety evaluation; raise awareness of these issues; and assist developers in improving the future safety and reliability of generative AI models. While the challenge interrogates Text-to-Image models, participants focus primarily on the text component of the system, i.e., finding text prompts that look safe and pass safety filters but nonetheless cause models to generate unsafe images.
The challenge is supported by Kaggle and MLCommons. It is hosted on the Dynabench platform and is part of the DataPerf challenge suite.
See Parrish et al., 2023 for a detailed description of the challenge.
Join the challenge at: https://www.dataperf.org/adversarial-nibbler (Note: you don't need an ML or CS background to participate. The challenge has been designed to be accessible to a wide range of researchers and developers with and without a traditional AI/ML background.)
Submit via softconf here!
Track 2: Workshop Topics Track
Regular workshop papers can be submitted to this track. We invite two types of original contributions related to workshop topics: empirical papers and position papers.
In empirical papers (4 pages), authors are invited to share novel findings, preliminary results, and post-hoc analyses.
In position papers (2 pages), authors can offer new perspectives, ideas, or theoretical comments that argue for challenges, benefits, best practices, and strategies in the study of red teaming and adversarial testing.
Both types of papers should offer arguments and cases that spark discussion among presenters and probe the concepts and interplay of the presented work and positions.
Submit via softconf here!
Formatting requirements
All submissions to both tracks must be in English. Position papers should be 2 pages and empirical papers 4 pages, excluding references and appendices. Papers must be formatted in the main proceedings style, using the most recent ARR paper submission template, and submitted via softconf.