AAAI 2024
ReLM Accepted Papers
Spotlight Talks
Breaking Free Transformer Models: Task-specific Context Attribution Promises Improved Generalizability Without Fine-tuning Pre-trained LLMs, Stepan Tytarenko (Fordham University); Mohammad Ruhul Amin (Fordham University) - Best Paper Award 🎉🎉🎉
Inverse Prompt Engineering for Safety in Large Language Models, Stewart Slocum (Massachusetts Institute of Technology); Dylan Hadfield-Menell (Massachusetts Institute of Technology) - Best Paper Runner-Up 🎉🎉
Evaluation and Enhancement of Semantic Grounding in Large Vision-Language Models, Jiaying Lu (Emory University); Jinmeng Rao (Mineral); Kezhen Chen (Mineral LLC); Xiaoyuan Guo (Mineral.ai); Yawen Zhang (Mineral.ai); Baochen Sun (X, The Moonshot Factory); Carl Yang (Emory University); Jie Yang (Mineral.ai)
Improving Activation Steering in Language Models with Mean-Centring, Ole K Jorgensen (Independent); Dylan Cope (King's College London); Nandi Schoots (King's College London); Murray Shanahan (Imperial College London)
Ethos: Rectifying Language Models in Orthogonal Parameter Space, Lei Gao (University of Southern California); Yue Niu (University of Southern California); Tingting Tang (University of Southern California); Salman Avestimehr (University of Southern California); Murali Annavaram (University of Southern California)
Towards Generating Informative Textual Description for Neurons in Language Models, Shrayani Mondal (UMass Amherst); Rishabh Garodia (UMass Amherst); Taesung Lee (IBM Research); Youngja Park (IBM Research); Syed Arbaaz Qureshi (UMass Amherst)
Posters
Assaying on the Robustness of Zero-Shot Machine-Generated Text Detectors, Yi-Fan Zhang (NLPR, China); Zhang Zhang (Institute of Automation, Chinese Academy of Sciences); Liang Wang (NLPR, China); Tieniu Tan (NLPR, China)
Uncertainty-aware Language Modeling for Selective Question Answering, Qi Yang (Themis AI); Shreya Ravikumar (Themis AI); Fynn Schmitt-Ulms (Themis AI); Satvik Lolla (Themis AI); Ege Demir (Themis AI); Iaroslav Elistratov (Themis AI); Alex Lavaee (Themis AI); Sadhana Lolla (Themis AI); Elaheh Ahmadi (Themis AI); Daniela Rus (MIT CSAIL); Alexander Amini (Massachusetts Institute of Technology); Alejandro Perez (Themis AI)
ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models, Zhixue Zhao (University of Sheffield); Boxuan Shan (University of Sheffield)
How Robust are LLMs to In-Context Majority Label Bias?, Karan Gupta (Amazon); Sumegh Roychowdhury (Amazon); Siva Rajesh Kasa (Amazon); Santhosh K Kasa (Amazon); Anish Bhanushali (Amazon); Nikhil Pattisapu (Amazon); Prasanna Srinivasa Murthy (Amazon)
Towards LLM-guided Causal Explainability for Black-box Text Classifiers, Amrita Bhattacharjee (Arizona State University); Raha Moraffah (Arizona State University); Joshua Garland (Arizona State University); Huan Liu (Arizona State University)
Calibrating Black Box LLMs with Prompt Variations, Yijin Hua (University of California, Los Angeles); Justin Svegliato (University of California, Berkeley); Sam Toyer (UC Berkeley); Stuart Russell (UC Berkeley); Anoop Sinha (Google)
A Baseline Analysis of Reward Models' Ability To Accurately Analyze Foundation Models Under Distribution Shift, Will LeVine (Scale AI); Benjamin Pikus (Scale AI); Anthony Chen (Scale AI); Sean Hendryx (Scale AI)
AdvGLUE-GPT: Towards Effective and Efficient Robustness Evaluation of Large Language Models, Xilie Xu (National University of Singapore); Keyi Kong (Shandong University); Ning Liu (School of Software, Shandong University); Lizhen Cui (Shandong University); Di Wang (KAUST); Jingfeng Zhang (University of Auckland); Mohan Kankanhalli (National University of Singapore)
Plug and Play with Prompts: A Prompt Tuning Approach for Controlling Text Generation, Rohan Deepak Ajwani (University of Toronto); Zining Zhu (University of Toronto); Jonathan S Rose (University of Toronto); Frank Rudzicz (University of Toronto)
A Question on the Explainability of Large Language Models and the Word-Level Univariate First-Order Plausibility Assumption, Jeremie Bogaert (UCLouvain); François-Xavier Standaert (UCLouvain)
Comparing GPT-4 and Open-Source Language Models in Misinformation Mitigation, Tyler Vergho (Dartmouth College); Reihaneh Rabbany (McGill University); Kellin Pelrine (McGill University)
Removing RLHF Protections in GPT-4 via Fine-Tuning, Qiusi Zhan (University of Illinois Urbana-Champaign); Richard Fang (University of Illinois Urbana-Champaign); Rohan Bindu (University of Illinois Urbana-Champaign); Akul Gupta (University of Illinois Urbana-Champaign); Tatsunori Hashimoto (Stanford University); Daniel Kang (University of Illinois Urbana-Champaign)
Ethical Artificial Intelligence Principles and Guidelines for the Governance and Utilization of Highly Advanced Large Language Models, Soaad Q. Hossain (University of Toronto); Syed Ishtiaque Ahmed (University of Toronto)
Tree Patching: New Metrics for Automatic Circuit Discovery with Token Positions, Joseph Miller (FAR AI); William Saunders (OpenAI)
Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from Knowledge Graphs, Chao Feng (University of Michigan); Xinyu Zhang (University of Michigan)