Generative Models for Decision Making

Workshop @ ICLR 2024

May 11, 2024 - Vienna, Austria

Generative Artificial Intelligence (AI) has made significant advancements in recent years, particularly with the development of large language and diffusion models. These generative models have demonstrated impressive capabilities across various domains, such as text [1-3], image, audio, and video [4-7]. Concurrently, decision making has made significant strides in solving complex sequential decision-making problems with the help of external knowledge sources [8-10]. However, there remains untapped potential in combining generative models with decision-making algorithms to tackle real-world challenges, particularly to improve the sample efficiency of tabula rasa training by introducing priors from related domains such as visual question answering, image captioning, and image generation.

This workshop aims to bring together researchers and practitioners from the fields of generative AI and decision making to explore the latest advances, methodologies, and applications. By fostering collaborations between these two domains, we intend to unlock new opportunities for addressing complex problems that lie at the intersection of both fields.

The workshop will cover a wide range of topics, including but not limited to:

Generative Models + Decision Making for scaling up solutions to complex problems

Voyager: An Open-Ended Embodied Agent with Large Language Models [url]
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control [url] 

Call for Papers

Generative AI has led to significant advances in natural language, vision, audio, and video. Such advances can lead to fundamental changes in decision making, and with the aim of bridging generative AI with the decision-making communities in control, planning, and reinforcement learning, we invite submissions in this area, including the following topics:


Submission instructions:

Submissions should follow the official ICLR 2024 LaTeX template. We welcome technical or position papers in two formats: short (4 pages, excluding references and appendix) and long (9 pages, excluding references and appendix). The main paper and the appendix should be submitted as a single PDF file on the submission website. Papers that have previously been published at other conferences will not be accepted; however, submissions of unpublished or ongoing work, such as work under review at ML conferences, are welcome.


Submission link:

All submissions should be made via OpenReview: https://openreview.net/group?id=ICLR.cc/2024/Workshop/GenAI4DM.


Reviewer recruiting form:

If you are interested in reviewing for the workshop, please fill out this form: https://forms.gle/hFiLgN91vBVgvnFp9


Double-blind policy:

All submissions should be properly anonymized by the authors to remove any identifying information such as author names, affiliations, and personalized GitHub links.

Please feel free to reach out with any questions about the CFP or the workshop by emailing genai.dm.iclr2024@gmail.com.

Schedule 

 Important Dates

Paper submission deadline: February 9th 2024 (Anywhere on Earth)

Decision notifications: March 13th 2024 (released)

Camera-ready paper deadline: N/A since the workshop has no proceedings

Workshop: May 11th 2024

Speakers

Noam Brown

OpenAI

Igor Mordatch

Google DeepMind

Katja Hofmann

Microsoft Research

Yuandong Tian

Meta (FAIR)

Jeannette Bohg

Stanford

Karthik Narasimhan

Princeton

 Organizers

Bogdan Mazoure

Apple

Devon Hjelm

Apple

Lisa Lee

Google DeepMind

Roberta Raileanu

Meta AI Research

Yilun Du

MIT

Walter Talbott

Apple

Alexander Toshev

Apple

Katherine Metcalf

Apple

Program Committee

TBD

References

[1] Hoffmann, Jordan, et al. "Training compute-optimal large language models." 2022. 

[2] Alayrac, Jean-Baptiste, et al. "Flamingo: a visual language model for few-shot learning." 2022.

[3] OpenAI. "GPT-4V(ision) system card." 2023.

[4] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." 2022.

[5] Ho, Jonathan, et al. "Imagen video: High definition video generation with diffusion models." 2022.

[6] Oord, Aaron van den, et al. "WaveNet: A generative model for raw audio." 2016.

[7] Liu, Haohe, et al. "AudioLDM: Text-to-audio generation with latent diffusion models." 2023.

[8] Wang, Guanzhi, et al. "Voyager: An open-ended embodied agent with large language models." 2023.

[9] Lifshitz, Shalev, et al. "STEVE-1: A Generative Model for Text-to-Behavior in Minecraft." 2023.

[10] Baker, Bowen, et al. "Video PreTraining (VPT): Learning to act by watching unlabeled online videos." 2022.