Aug 19, 2024: Course Logistics and Intro to Modeling Social Factors in NLP [Snigdha's slides]
Aug 26, 2024: Social Intelligence in LLMs
Natalie Shapira, Mosh Levy, Seyed Hossein Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz, "Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models", EACL 2024 [Bek's slides]
Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M. Dai, Diyi Yang, Soroush Vosoughi, "Training Socially Aligned Language Models on Simulated Social Interactions", ICLR 2024 [Rajeev's slides]
Sep 9, 2024: Recap of Transformers and Prompting Techniques [Snigdha's slides]
Sep 16, 2024: Reinforcement Learning from Human Feedback [Snigdha's slides] and Project topic discussion
Sep 23, 2024: Wellness day
Sep 30, 2024: How to align LLMs to human judgment?
Po-Nien Kung, Fan Yin, Di Wu, Kai-Wei Chang, and Nanyun Peng, “Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks”, EMNLP 2023 [Rui's slides]
Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn, "Direct Preference Optimization: Your Language Model is Secretly a Reward Model", NeurIPS 2023 [Titus's slides][notes]
Oct 8, 2024: Can LLMs work with humans?
Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande, Ashwin Kalyan, Peter Clark, Ashish Sabharwal, and Tushar Khot, "Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs" [Luxuan's slides]
Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, and Karthik Narasimhan, "Toxicity in ChatGPT: Analyzing Persona-assigned Language Models" [Yijun's slides]
Oct 14, 2024: Midterm Project Presentations
Oct 21, 2024: Can LLMs reason like humans?
Yi Zeng, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia, Weiyan Shi, "How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs", ACL 2024 [Mason's slides]
Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan, "Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?", ICML 2024 [Vedant's slides]
Oct 28, 2024:
Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, and Maarten Sap, "SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents", ICLR 2024 [Tripp's slides]
Akshita Jha, Vinodkumar Prabhakaran, Remi Denton, Sarah Laszlo, Shachi Dave, Rida Qadri, Chandan K. Reddy, Sunipa Dev, "ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation", ACL 2024 [Nicholas's slides]
Nov 4, 2024:
[Guest lecture: Anvesh Rao Vijjini] Anvesh Rao Vijjini, Rakesh R Menon, Shashank Srivastava, Snigdha Chaturvedi, "SocialGaze: Improving the Integration of Human Social Norms in Large Language Models", EMNLP Findings 2024 [paper link]
Samuel Cahyawijaya, Delong Chen, Yejin Bang, Leila Khalatbari, Bryan Wilie, Ziwei Ji, Etsuko Ishii, Pascale Fung, "High-Dimension Human Value Representation in Large Language Models", arXiv 2024 [Sanjay's slides]
Nov 11, 2024: Guest lecture by Elias Stengel-Eskin: https://esteng.github.io/ [slides]
Nov 18, 2024:
Yuanshun Yao, Xiaojun Xu, Yang Liu, "Large Language Model Unlearning" (applying unlearning to LLMs to reduce the generation of copyrighted or harmful content), ICLR 2024 [Nolan's slides]
Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov, "From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models" (analyzing pretraining data's effect on political bias), ACL 2023 [Snehashish's slides]
Nov 25, 2024:
Ella Li, Taiwei Shi, Caleb Ziems, Min-Yen Kan, Nancy F. Chen, Zhengyuan Liu, Diyi Yang, "CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation", EMNLP 2023 [slides]
Michael J. Ryan, William Held, Diyi Yang, "Unintended Impacts of LLM Alignment on Global Representation", ACL 2024 [slides]
Dec 2: Final Project Presentations