Accepted Papers
Discord link for the virtual poster presentations
https://discord.gg/qR5y65G5Cz
Safety and Fairness for Content Moderation in Generative Models [paper]
Bhaktipriya Radharapu; Piyush Kumar; Renee Shelby; Sarah Laszlo; Shivani Poddar; Susan Hao
With significant advances in generative AI, new technologies are rapidly being deployed with generative components. Generative models are typically trained on large datasets, resulting in model behaviors that can mimic the worst of the content in the training data. Responsible deployment of generative technologies requires content moderation strategies, such as safety input and output filters. Here, we provide a theoretical framework for conceptualizing responsible content moderation of text-to-image generative technologies, including a demonstration of how to empirically measure the constructs we enumerate. We define and distinguish the concepts of safety, fairness, and metric equity, and enumerate example harms that can come in each domain. We then provide a demonstration of how the defined harms can be quantified. We conclude with a summary of how the style of harms quantification we demonstrate enables data-driven content moderation decisions.
Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models [paper]
Krishna Sri Ipsit Mantri (Indian Institute of Technology Bombay)*; Nevasini NA Sasikumar (PESU)
Fashionable image generation aims to synthesize images of diverse fashion prevalent around the globe, helping fashion designers in real-time visualization by giving them a basic customized structure of how a specific design preference would look in real life and what further improvements can be made for enhanced customer satisfaction. Moreover, users can interact on their own and generate fashionable images by giving just a few simple prompts. Recently, diffusion models have gained popularity as generative models owing to their flexibility and generation of realistic images from Gaussian noise. Latent diffusion models are a type of generative model that use diffusion processes to model the generation of complex data, such as images, audio, or text. They are called "latent" because they learn a hidden representation, or latent variable, of the data that captures its underlying structure. We propose a method that exploits the equivalence between diffusion models and energy-based models (EBMs) and suggests ways to compose multiple probability distributions. We describe a pipeline showing how our method can be used specifically for new fashionable outfit generation and virtual try-on using LLM-guided text-to-image generation. Our results indicate that using an LLM to refine the prompts to the latent diffusion model assists in generating globally creative and culturally diversified fashion styles and reducing bias.
Generative AI: Challenges and Opportunities in the Context of India [paper]
Viraj Shah (University of Illinois, Urbana-Champaign)*; Kartik K Patel (The University of Texas at Austin)
The recent rise in popularity of generative AI has fueled increased interest in understanding the cross-cultural performance of such models when used for content creation. India, with its large population and rich cultural diversity, offers an ideal context for analyzing the potential of generative AI in content creation and for exploring the challenges of adapting this technology to diverse cultures globally. This position paper aims to draw attention to several unique challenges that generative AI may encounter in the Indian context and to initiate a discussion on potential research directions to enhance the reliability of generative AI in this context.
Towards More Realistic Membership Inference Attacks on Large Diffusion Models [paper]
Jan Michał Dubiński (Warsaw University of Technology)*; Antoni Kowalczuk (Warsaw University of Technology); Stanisław Pawlak (Warsaw University of Technology); Przemyslaw Rokita (Warsaw University of Technology); Pawel Morawiecki (Polish Academy of Sciences); Tomasz Trzcinski (Warsaw University of Technology, Tooploox, Jagiellonian University)
Generative diffusion models such as Stable Diffusion or Midjourney are now able to generate beautiful, diverse, high-resolution images for a wide range of applications. These models are trained on billions of images scraped from the Internet. At such a scale, a serious concern arises about unlawful use of images protected by copyright. In this paper we investigate the question of whether a given image was used in the training set, a problem known as the membership inference attack. We focus on Stable Diffusion and identify a real challenge of designing a fair evaluation setup for answering the membership question. We propose a methodology to create a fair setup and apply it to Stable Diffusion. With this evaluation setup, we conduct membership attacks (both known and newly introduced). We highlight that previously proposed evaluation setups could give a very misleading picture of the effectiveness of the membership attacks. Our findings lead to the conclusion that for large diffusion models (often deployed as a black box), the membership inference attack is still a serious challenge. Consequently, the related privacy and ownership concerns will remain in the near future.
Towards AI Art Curation: Re-imagining the city of Helsinki in occasion of its Biennial [paper]
Dario Negueruela del Castillo (University of Zurich); Ludovica Schaerf (University of Zurich)*; Pepe Ballesteros Zapata (University of Zurich); Iacopo Neri (University of Zurich); Valentine Bernasconi (Digital Visual Studies, University of Zurich)
Art curatorial processes are characterized by the presentation of a collection of artworks in a knowledgeable way. Machine processes are characterized by their capacity to manage and analyze large amounts of data. This paper envisages machine curation and audience interaction as a means to explore the implications of contemporary AI models for the curatorial world. This project was developed for the occasion of the 2023 Helsinki Art Biennial, entitled New Directions May Emerge. We use the Helsinki Art Museum (HAM) collection to re-imagine the city of Helsinki through the lens of machine perception. We use visual-textual models to place indoor artworks in public spaces, assigning fictional coordinates based on similarity scores. Synthetic 360° art panoramas are generated using diffusion-based models to propose a machinic visual style guided by the artworks. The result of this project will be virtually presented as a web-based installation, where such a re-contextualization allows the navigation of an alternative version of the city while exploring its artistic heritage. Finally, we discuss our contributions to machine curation and the ethical implications that such a process entails.
Inspecting the Geographical Representativeness of Images from Text-to-Image Models [paper]
Abhipsa Basu (Indian Institute of Science, Bangalore)*; Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science); Danish Pruthi (Indian Institute of Science)
Recent progress in generative models has resulted in models that produce both realistic as well as relevant images for most textual inputs. These models hold the potential to drastically impact areas such as generative art, digital marketing and data augmentation. Given their outsized impact, one must ensure that the generated content reflects the artifacts and surroundings across the globe, rather than over-representing certain parts of the world. In this paper, we measure the geographical representativeness of common nouns (e.g., a house) generated through DALL·E 2 and Stable Diffusion models using a crowdsourced study comprising 540 participants across 27 countries. For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States followed by India, and the top generations rarely reflect surroundings from all other countries (average score less than 3 out of 5). Specifying the country names in the input increases the representativeness by 1.44 points for DALL·E 2 and 0.75 for Stable Diffusion; however, the overall scores for many countries still remain low, highlighting the need for future models to be more geographically inclusive.
VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments [paper]
Yu-Yun Tseng (University of Colorado Boulder)*; Alexander Bell (IVC Group); Danna Gurari (University of Colorado Boulder)
We introduce a few-shot localization dataset originating from photographers who authentically were trying to learn about the visual content in the images they took. It includes nearly 10,000 segmentations of 100 categories in over 4,500 images that were taken by people with visual impairments. Compared to existing few-shot object detection and instance segmentation datasets, our dataset is the first to locate holes in objects (e.g., found in 12.3% of our segmentations), it shows objects that occupy a much larger range of sizes relative to the images, and text is over five times more common in our objects (e.g., found in 22.4% of our segmentations). Analysis of three modern few-shot localization algorithms demonstrates that they generalize poorly to our new dataset. The algorithms commonly struggle to locate objects with holes, very small and very large objects, and objects lacking text. To encourage a larger community to work on these unsolved challenges, we publicly share our annotated few-shot dataset at https://vizwiz.org.
Digital Overconsumption and Waste: A Closer Look at the Impacts of Generative AI [paper]
Vanessa B Utz (Simon Fraser University)*; Steve DiPaola (Simon Fraser University)
Generative Artificial Intelligence (AI) systems currently contribute negatively to the production of digital waste, through high energy consumption and the related CO2 emissions. At this moment, a discussion is needed on the replication of harmful consumer behavior, namely over-consumption, in the digital space. We outline our previous work on the climate implications of commercially available generative AI systems and the sentiment of generative AI users when confronted with AI-related climate research. We expand on this work via a discussion of digital over-consumption and waste, other related societal impacts and a possible solution pathway.