2nd Workshop on Critical Evaluation of
Generative Models and Their Impact on Society
20 October 2025
at ICCV 2025, Honolulu, Hawaii
Visual generative models have revolutionized our ability to generate realistic images, videos, and other visual content. However, with great power comes great responsibility. While the computer vision community continues to innovate with models trained on vast datasets to improve visual quality, questions arise regarding the adequacy of current evaluation protocols. Automatic measures such as CLIPScore and FID may not fully capture human perception, while human evaluation methods are costly and lack reproducibility. Alongside these technical considerations, artists and social scientists have raised critical concerns about the ethical, legal, and social implications of visual generative technologies. The democratization and accessibility of these technologies exacerbate issues such as privacy violations, copyright infringement, and the perpetuation of social biases, demanding urgent attention from our community.
This interdisciplinary workshop aims to convene experts from computer vision, machine learning, social sciences, digital humanities, and other relevant fields. By fostering collaboration and dialogue, we seek to address the complex challenges associated with visual generative models and their evaluation, benchmarking, and auditing.
Simran Khanuja has been a PhD student at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University since August 2022. Her research focuses on expanding the capabilities of multimodal systems to serve a wide range of users across languages and cultures, with applications in localization, information access, conversational AI, education, and assistive technologies. Previously, she was a Pre-Doctoral Researcher at Google Research and worked at Microsoft Research. She has contributed to advancing under-represented languages in NLP, and her work has been published at top NLP conferences such as ACL and EMNLP, including best paper awards at EMNLP 2024, IEEE BigData 2024, and SLT 2022. She is also a recipient of the Waibel Presidential Fellowship for 2024-25.
Alice Xiang is the Global Head of AI Ethics at Sony. As the VP leading AI ethics initiatives across Sony Group, she manages the team responsible for conducting AI ethics assessments across Sony's business units and implementing Sony's AI Ethics Guidelines. In addition, as the Lead Research Scientist for AI ethics at Sony AI, Alice leads a lab of AI researchers working on cutting-edge research to enable the development of more responsible AI solutions. Alice also recently served as a General Chair for the ACM Conference on Fairness, Accountability, and Transparency (FAccT), the premier multidisciplinary research conference on these topics. Alice is both a lawyer and statistician, with experience developing machine learning models and serving as legal counsel for technology companies. Alice holds a Juris Doctor from Yale Law School, a Master’s in Development Economics from Oxford, a Master’s in Statistics from Harvard, and a Bachelor’s in Economics from Harvard.
13:30 - 13:35: Opening
13:35 - 14:10: Invited talk by Simran Khanuja
14:10 - 14:55: Oral session
CAIRE: Cultural Attribution of Images by Retrieval-Augmented Evaluation
On the Distributed Evaluation of Generative Models
An Information-Theoretic Approach to Diversity Evaluation of Prompt-based Generative Models
14:55 - 16:00: Coffee break and poster session
16:00 - 16:20: Invited papers session
CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems
WorldScore: A Unified Evaluation Benchmark for World Generation
16:20 - 16:55: Invited talk by Alice Xiang
16:55 - 17:00: Closing
CAIRE: Cultural Attribution of Images by Retrieval-Augmented Evaluation
Siddharth Yayavaram, Arnav Yayavaram, Simran Khanuja, Michael Saxon, Graham Neubig
On the Distributed Evaluation of Generative Models
Zixiao Wang, Farzan Farnia, Zhenghao Lin, Yunheng Shen, Bei Yu
An Information-Theoretic Approach to Diversity Evaluation of Prompt-based Generative Models
Mohammad Jalali, Azim Ospanov, Amin Gohari, Farzan Farnia
Memorization in 3D Shape Generation
Shu Pu, Boya Zeng, Zhuang Liu
CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems
Aniket Rege, Zinnia Nie, Mahesh Ramesh, Unmesh Raskar, Zhuoran Yu, Aditya Kusupati, Yong Jae Lee, Ramya Korlakai Vinayak
WorldScore: A Unified Evaluation Benchmark for World Generation
Haoyi Duan, Hong-Xing Yu, Sirui Chen, Li Fei-Fei, Jiajun Wu
Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Yusuke Hirota, Ryo Hachiuma, Boyi Li, Ximing Lu, Michael Ross Boone, Boris Ivanovic, Yejin Choi, Marco Pavone, Yu-Chiang Frank Wang, Noa Garcia, Yuta Nakashima, Chao-Han Huck Yang
DiffTell: A High-Quality Dataset for Describing Image Manipulation Changes
Zonglin Di, Jing Shi, Yifei Fan, Hao Tan, Alexander Black, John Collomosse, Yang Liu
Noa Garcia
The University of Osaka
Amelia Katirai
University of Tsukuba
Kento Masui
CyberAgent
Mayu Otani
CyberAgent
Yankun Wu
The University of Osaka
To contact the organizers, please use cegis-workshop@googlegroups.com.