Schedule
The schedule below is tentative and subject to change.
Week 1: January 16
Georgila - Overview, different types of dialogue, example dialogue systems, topics to be covered
Week 2: January 23
Georgila - Continuation of overview, basic principles of dialogue processing (initiative, grounding, dialogue acts, turn-taking), knowledge-based dialogue management (information states, logic-based approaches)
Assignment 1 handed out
Week 3: January 30
Georgila - Reinforcement learning and simulated users for dialogue management (Part 1)
Week 4: February 6
Georgila - Reinforcement learning and simulated users for dialogue management (Part 2)
Assignment 1 due (Thursday February 5, 11:59 pm)
Week 5: February 13
Georgila - Data collection, dialogue corpora and annotation, dialogue evaluation (manual and automatic)
Assignment 2 handed out
Week 6: February 20
Georgila - Speech recognition and speech synthesis for dialogue
Week 7: February 27
Georgila - Deep learning approaches to dialogue (including end-to-end architectures and chatbots), dialogue state tracking
Assignment 2 due (Monday March 2, 11:59 pm)
Selected special topic due (Monday March 2, 11:59 pm)
Week 8: March 6
Georgila - Reinforcement learning from human and AI feedback, natural language understanding, natural language generation
Project white paper due (Thursday March 5, 11:59 pm)
Week 9: March 13
Georgila - Multi-party dialogue, turn-taking, team dialogue, healthcare applications
March 20 - Spring Break
Week 10: March 27
Guest lecture: Prof. David Traum - Grounding
Week 11: April 3
Project proposal due (Thursday April 2, 11:59 pm)
Student topic presentations
Visual dialogue - Kaushal, Grace, Arushi - 30 min
Long, Yuxing, Xiaoqi Li, Wenzhe Cai, and Hao Dong. Discuss before moving: Visual language navigation via multi-expert discussions. In 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 17380-17387. IEEE, 2024.
Qiao, Yanyuan, Qianyi Liu, Jiajun Liu, Jing Liu, and Qi Wu. LLM as copilot for coarse-grained vision-and-language navigation. In European Conference on Computer Vision, pp. 459-476. Cham: Springer Nature Switzerland, 2024.
Han, Leekyeung, Hyunji Min, Gyeom Hwangbo, Jonghyun Choi, and Paul Hongsuck Seo. DialNav: Multi-turn Dialog Navigation with a Remote Guide. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8514-8523. 2025.
Mixed-initiative dialogue - Runhui, Xixiao - 20 min
Yuxiang Nie, Heyan Huang, Xian-Ling Mao, and Lizi Liao. 2024. Mix-Initiative Response Generation with Dynamic Prefix Tuning. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 8748–8761, Mexico City, Mexico. Association for Computational Linguistics.
Maximillian Chen, Ruoxi Sun, Tomas Pfister, and Sercan O Arik. Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training. ICLR 2025.
Manual and automatic evaluation metrics for task-oriented dialogue - Tiannuo, Linxin - 20 min
Arihant Jain, Purav Aggarwal, Rishav Sahay, Chaosheng Dong, and Anoop Saladi. 2025. AutoEval-ToD: Automated Evaluation of Task-oriented Dialog Systems. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 10133–10148, Albuquerque, New Mexico. Association for Computational Linguistics.
Emre Can Acikgoz, Carl Guo, Suvodip Dey, Akul Datta, Takyoung Kim, Gokhan Tur, and Dilek Hakkani-Tur. 2025. TD-EVAL: Revisiting Task-Oriented Dialogue Evaluation by Combining Turn-Level Precision with Dialogue-Level Comparisons. In Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 113–132, Avignon, France. Association for Computational Linguistics.
Non-cooperative dialogue systems - Athirai - 15 min
Mike Lewis, Denis Yarats, Yann Dauphin, Devi Parikh, and Dhruv Batra. 2017. Deal or No Deal? End-to-End Learning of Negotiation Dialogues. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2443–2453, Copenhagen, Denmark. Association for Computational Linguistics.
Companion dialogue systems - Chi - 15 min
Zheyong Xie, Shaosheng Cao, Zuozhu Liu, Zheyu Ye, Zihan Niu, Chonggang Lu, Tong Xu, Enhong Chen, Zhe Xu, Yao Hu, and Wei Lu. 2025. iPET: An Interactive Emotional Companion Dialogue System with LLM-Powered Virtual Pet World Simulation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 416–425, Vienna, Austria. Association for Computational Linguistics.
Turn-taking - Nithi - 15 min
Choi, Min Gyeong, and Sun-Young Oh. "Developing L2 turn-taking with ChatGPT: A longitudinal conversation analytic study." System 138 (2026): 103959.
Embodied conversational agents - Fatemeh, Phillip - 20 min
Angus Addlesee, Neeraj Cherakara, Nivan Nelson, Daniel Hernandez Garcia, Nancie Gunson, Weronika Sieińska, Christian Dondrup, and Oliver Lemon. 2024. Multi-party Multimodal Conversations Between Patients, Their Companions, and a Social Robot in a Hospital Memory Clinic. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations, pages 62–70, St. Julians, Malta. Association for Computational Linguistics.
Neeraj Cherakara, Finny Varghese, Sheena Shabana, Nivan Nelson, Abhiram Karukayil, Rohith Kulothungan, Mohammed Afil Farhan, Birthe Nesset, Meriam Moujahid, Tanvi Dinkar, Verena Rieser, and Oliver Lemon. 2023. FurChat: An Embodied Conversational Agent using LLMs, Combining Open and Closed-Domain Dialogue with Facial Expressions. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 588–592, Prague, Czechia. Association for Computational Linguistics.
Agnes Axelsson and Gabriel Skantze. Do you follow? A fully automated system for adaptive robot presenters. International Conference on Human-Robot Interaction, 2023.
NLU for dialogue - Valliammai, Aarushi - 20 min
Kalpa Gunaratna, Vijay Srinivasan, Akhila Yerukola, and Hongxia Jin. Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling. EMNLP Findings, 2022.
Omar Shaikh, Kristina Gligoric, Ashna Khetan, Matthias Gerstgrasser, Diyi Yang, and Dan Jurafsky. 2024. Grounding Gaps in Language Model Generations. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 6279–6296, Mexico City, Mexico. Association for Computational Linguistics.
Week 12: April 10
Assignment 3 handed out
Student topic presentations
Modeling/recognizing affect in dialogue systems - Emily, Bowen - 20 min
Pierre Colombo, Wojciech Witon, Ashutosh Modi, James Kennedy, and Mubbasir Kapadia. 2019. Affect-Driven Dialog Generation. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3734–3743, Minneapolis, Minnesota. Association for Computational Linguistics.
Nicholas Sofroniew, Isaac Kauvar, William Saunders, Runjin Chen, Tom Henighan, Sasha Hydrie, Craig Citro, Adam Pearce, Julius Tarng, Wes Gurnee, Joshua Batson, Sam Zimmerman, Kelley Rivoire, Kyle Fish, Chris Olah, and Jack Lindsey. Emotion concepts and their function in a large language model. 2026.
Multi-party conversations - Sheryl, Lydia - 20 min
Hiroki Ouchi and Yuta Tsuboi. 2016. Addressee and Response Selection for Multi-Party Conversation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2133–2143, Austin, Texas. Association for Computational Linguistics.
Maira Gatti de Bayser, Melina Alberio Guerra, Paulo Cavalin, and Claudio Pinhanez. A Hybrid Solution to Learn Turn-Taking in Multi-Party Service-based Chat Groups. 2020.
Nicolò Penzo, Maryam Sajedinia, Bruno Lepri, Sara Tonelli, and Marco Guerini. 2024. Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 11210–11233, Miami, Florida, USA. Association for Computational Linguistics.
Ronald Petrick and Mary Ellen Foster. Planning for social interaction in a robot bartender domain. International Conference on Automated Planning and Scheduling. 2013.
Chelsea, Yihe - 20 min
Lakshya A Agrawal, Shangyin Tan, Dilara Soylu, Noah Ziems, Rishi Khare, Krista Opsahl-Ong, Arnav Singhvi, Herumb Shandilya, Michael J Ryan, Meng Jiang, Christopher Potts, Koushik Sen, Alexandros G. Dimakis, Ion Stoica, Dan Klein, Matei Zaharia, and Omar Khattab. GEPA: Reflective prompt evolution can outperform reinforcement learning. ICLR 2026.
Qizheng Zhang, Changran Hu, Shubhangi Upasani, Boyuan Ma, Fenglu Hong, Vamsidhar Kamanuru, Jay Rainton, Chen Wu, Mengmeng Ji, Hanchen Li, Urmish Thakker, James Zou, and Kunle Olukotun. Agentic context engineering: Evolving contexts for self-improving language models. ICLR 2026.
Manual and automatic evaluation metrics for task-oriented dialogue - Autumn, Vincent-Daniel - 20 min
Abishek Komma, Nagesh Panyam Chandrasekarasastry, Timothy Leffel, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, and Aram Galstyan. 2023. Toward More Accurate and Generalizable Evaluation Metrics for Task-Oriented Dialogs. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pages 186–195, Toronto, Canada. Association for Computational Linguistics.
Jiseung Hong, Grace Byun, Seungone Kim, and Kai Shu. 2025. Measuring Sycophancy of Language Models in Multi-turn Dialogues. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 2239–2259, Suzhou, China. Association for Computational Linguistics.
Text2LoRA and Doc2LoRA - Daniel, Deyang - 20 min
Rujikorn Charakorn, Edoardo Cetin, Yujin Tang, and Robert Tjarko Lange. Text-to-LoRA: Instant transformer adaption. ICML 2025.
Rujikorn Charakorn, Edoardo Cetin, Shinnosuke Uesaka, and Robert Tjarko Lange. Doc-to-LoRA: Learning to instantly internalize contexts. 2026.
Indigenous ASR - Faith - 15 min
Robbie Jimerson and Emily Prud’hommeaux. 2018. ASR for Documenting Acutely Under-Resourced Indigenous Languages. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association (ELRA).
Juann - 15 min
Runjin Chen, Andy Arditi, Henry Sleight, Owain Evans, and Jack Lindsey. Persona Vectors: Monitoring and Controlling Character Traits in Language Models. 2025.
Leonardo - 15 min
Asma Ghandeharioun, Ann Yuan, Marius Guerard, Emily Reif, Michael A. Lepori, and Lucas Dixon. Who's asking? User personas and the mechanics of latent misalignment. NeurIPS 2024.
Chris - 15 min
Federico Bianchi, Patrick John Chia, Mert Yuksekgonul, Jacopo Tagliabue, Dan Jurafsky, and James Zou. How well can LLMs negotiate? NEGOTIATIONARENA platform and analysis. Proceedings of the 41st International Conference on Machine Learning 2024.
From LLM to chatbot, post-training technique - Leland, Changhui - 20 min
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020.
Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, and Joseph E. Gonzalez. MemGPT: Towards LLMs as operating systems. 2024.
Percy Liang et al. Holistic evaluation of language models. Transactions on Machine Learning Research 2023.
Stephanie Lin, Jacob Hilton, and Owain Evans. 2022. TruthfulQA: Measuring How Models Mimic Human Falsehoods. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3214–3252, Dublin, Ireland. Association for Computational Linguistics.
Week 13: April 17
Student topic presentations
Dialogue state tracking - Abhijith - 15 min
Zhaojiang Lin, Bing Liu, Andrea Madotto, Seungwhan Moon, Zhenpeng Zhou, Paul Crook, Zhiguang Wang, Zhou Yu, Eunjoon Cho, Rajen Subba, and Pascale Fung. 2021. Zero-Shot Dialogue State Tracking via Cross-Task Transfer. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7890–7900, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Student project presentations
Internalization of biases in large language models - Faith, Leonardo - 30 min
PolyPersona: Cross-lingual structure of personality directions in large language models - Juann, Nithi - 30 min
Training a high-efficiency multi-task transformer - Changhui, Leland - 30 min
Pitch: Cost-aware automatic evaluation for task-oriented dialogue agents - Tiannuo, Linxin - 30 min
LLM sycophancy in multi-turn reasoning dialogues - Autumn, Vincent-Daniel - 30 min
Week 14: April 24
Assignment 3 due (Thursday April 23, 11:59 pm)
Student project presentations
Logic script writer: Multi-agent collaboration for logical consistency in script writing - Chi - 25 min
Socially intelligent LLM tutor - Sheryl - 25 min
End-to-end robotic gesture synthesis from dialogue for Blossom Squish and Stretch Robot - Lydia - 25 min
Vision-language reinforcement learning for multi-turn dialogue agents in maze navigation - Grace, Athirai - 30 min
Grounded dialogue: Ontology-driven scaffolding in conversational AI tutoring system - Kaushal, Arushi - 30 min
Temporal dialogue memory graphs for long-term conversational reasoning - Runhui, Xixiao - 30 min
Dialogue state tracking - Abhijith - 25 min
Week 15: May 1
Student project presentations
Code-switching robustness in voice agents - Chelsea, Yihe - 30 min
Autobiographical interviews - Emily - 25 min
LLM-based reward decomposition for task-oriented dialogue policy learning - Fatemeh, Phillip - 30 min
Evaluating LLM robustness via heterogeneous multi-agent debate (H-MAD) - Deyang, Aarushi, Valliammai - 35 min
A data generation pipeline for empathetic sycophancy in mental health dialogue - Bowen - 25 min
Domain adaptation - Daniel C. R. - 25 min
TBD - Chris - 25 min
Examination period: May 6
Student project reports due (May 6, 4 pm)