Publications

S Zhu, A Karpovich, A Chen, J Koscheka, S Jannu, D Wen, Y Zhu, R Jain, A. Geramifard, "Agentic Reinforcement Learning for Real-World Code Repair", arXiv 2025 [paper]
S Zhu, Y Jiang, H Sang, S Tang, Q Song, B He, R Jain, Z Wang, A Geramifard, "Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs", arXiv 2025 [paper]
C. Gunasekara, ... A. Geramifard, ... "Overview of the Ninth Dialog System Technology Challenge: DSTC9", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 [paper]
R Chitnis, S Yang, A. Geramifard, "Sequential Decision-Making for Inline Text Autocomplete", Reinforcement Learning Conference (RLC), 2024 [36% acceptance][paper]
P. Bhargava, R. Chitnis, A. Geramifard, S. Sodhani, A. Zhang, "Sequence Modeling is a Robust Contender for Offline Reinforcement Learning", International Conference on Learning Representations (ICLR), 2023 [31% acceptance][paper]
H. Sikchi, R. Chitnis, A. Touati, A. Geramifard, A. Zhang, S. Niekum, "Score Models for Offline Goal-Conditioned Reinforcement Learning", International Conference on Learning Representations (ICLR), 2023 [31% acceptance][paper]
T. Huang, S. Halbe, C. Sankar, P. Amini, S. Kottur, A. Geramifard, M. Razaviyayn, A. Beirami, "Robustness through Data Augmentation Loss Consistency", Transactions on Machine Learning Research (TMLR), 2023 [paper]
S. Moon, S. Kottur, A. Geramifard, B. Damavandi, "Navigating Connected Memories with a Task-oriented Dialog System", Empirical Methods in Natural Language Processing (EMNLP), 2022 [paper - Oral]
K. Qian, S. Kottur, A. Beirami, S. Shayandeh, P. Crook, A. Geramifard, Z. Yu, and C. Sankar, "Database search results disambiguation for task-oriented dialog systems," 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022 [paper]
S. Kottur, S. Moon, A. Geramifard, B. Damavandi, "SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations", Empirical Methods in Natural Language Processing (EMNLP), 2022 [paper]
H. Le, C. Sankar, S. Moon, A. Beirami, A. Geramifard, and S. Kottur, "DVD: A diagnostic dataset for multi-step reasoning in video grounded dialogue," The 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021 [paper][code]
K. Qian, A. Beirami, Z. Lin, A. De, A. Geramifard, Z. Yu, and C. Sankar, "Annotation inconsistency and entity bias in MultiWOZ," The 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SigDial), 2021 [paper][video]
S. Kottur, C. Sankar, Z. Yu, A. Geramifard, "DialogStitch: Synthetic Deeper and Multi-Context Task-Oriented Dialogs", Special Interest Group on Discourse and Dialogue (SigDial), 2021 [paper][code]
S. Kottur, P. A. Crook, S. Moon, A. Beirami, E. Cho, R. Subba, and A. Geramifard, "An analysis of state-of-the-art models for situated interactive multimodal conversations (SIMMC)," The 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SigDial), 2021 [paper][code][video]
T. Huang, S. Halbe, C. Sankar, P. Amini, S. Kottur, A. Geramifard, M. Razaviyayn, and A. Beirami, "DAIR: Data augmented invariant regularization," Transactions on Machine Learning Research (TMLR), 2022. [paper][code]
S. Moon, S. Kottur, P. A. Crook, A. De, S. Poddar, T. Levin, D. Whitney, D. Difranco, A. Beirami, E. Cho, R. Subba, and A. Geramifard, "Situated and interactive multimodal conversations," The 28th International Conference on Computational Linguistics (COLING), 2020 [paper][code]
J. Mendez, A. Geramifard, M. Ghavamzadeh, B. Liu, “Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings”, 3rd Workshop on Conversational AI Today’s Practice Tomorrow’s Potential, NeurIPS, 2019 [paper]
P. A. Crook, S. Poddar, A. De, S. Shafi, D. Whitney, A. Geramifard, R. Subba, "SIMMC: Situated Interactive Multi-Modal Conversational Data Collection And Evaluation Platform", IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Demonstration Track, 2019 [paper - Best Paper Award]
P. K. Bodigutla, L. Wang, K. Ridgeway, J. Levy, S. Joshi, A. Geramifard, S. Matsoukas, “Domain-Independent turn-level Dialogue Quality Evaluation via User Satisfaction Estimation”, Special Interest Group on Discourse and Dialogue (SigDial), Special Session, 2019 [paper]
M. Fazel-Zarandi, S. Li, J. Cao, P. Henderson, A. Geramifard, "Learning Robust Dialog Policies for Conversational Error Recovery," Amazon Machine Learning Conference (AMLC), 2018 (Oral - 4% acceptance rate)
P. K. Bodigutla, Y. Yan, S. Joshi, A. Geramifard, "Sentiment Analysis in Human-Machine Interaction", Amazon Machine Learning Conference (AMLC), 2017
J. Casale, J. Cao, S. Li, A. Geramifard, "Toward a Conversational Bot - MovieBot", Amazon Machine Learning Conference (AMLC), 2017
M. Fazel-Zarandi, S. Li, J. Cao, J. Casale, D. Whitney, and A. Geramifard, “Learning Robust Dialog Policies in Noisy Environments”, Workshop on Conversational AI Today’s Practice Tomorrow’s Potential, NeurIPS, 2017 [paper]
A. Geramifard, C. Dann, R. H. Klein, W. Dabney, J. P. How, “RLPy: A Value-Function-Based Reinforcement Learning Framework for Education and Research”, Journal of Machine Learning Research, JMLR, 2015 [paper]
S. Ponda, L. B. Johnson, A. Geramifard, J. P. How, “Cooperative Mission Planning for Multi-UAV Teams”, in the Handbook of Unmanned Aerial Vehicles, Chapter 16, Springer, 2014 [Springer Link] [Amazon Link]
A. Geramifard, T. Walsh, S. Tellex, G. Chowdhary, N. Roy, J. P. How, “A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning”, Foundations and Trends in Machine Learning (FTML), 2013 [paper]
T. Campbell, R. Klein, A. Geramifard, J. How, “Simultaneous Clustering on Representation Expansion for Learning Multimodel MDPs”, The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2013 [paper]
A. Geramifard, C. Dann, J. How, “Off-Policy Learning Combined with Automatic Feature Expansion for Solving Large MDPs”, The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2013 [paper]
C. Amato, G. Chowdhary, A. Geramifard, K. Ure, “Decentralized Control of Partially Observable Markov Decision Processes”, The 52nd IEEE Conference on Decision and Control (CDC), 2013 [paper]
A. Geramifard, T. Walsh, N. Roy, and J. How, “Batch iFDD: A Scalable Matching Pursuit Algorithm for Solving MDPs”, The Conference on Uncertainty in Artificial Intelligence (UAI), 2013 [31% acceptance] [paper]
J. Joseph, A. Geramifard, W. Roberts, J. How and N. Roy, “Reinforcement Learning with Misspecified Model Classes”, IEEE International Conference on Robotics and Automation (ICRA), 2013 [39% acceptance][nominated for Best Cognitive Robotics Paper Award] [paper]
J. Joseph, A. Geramifard, J. How and N. Roy, “Reinforcement Learning with Misspecified Bayesian Nonparametric Model Classes”, Workshop on Bayesian Nonparametric Models For Reliable Planning And Decision-Making Under Uncertainty, NeurIPS, 2012 [paper]
A. Geramifard, J. Redding, J. P. How, “Intelligent Cooperative Control Architecture: A framework for performance improvement using safe learning”, Journal of Intelligent and Robotic Systems (JIRS), Springer, 2012 [paper]
K. Ure, A. Geramifard, G. Chowdhary, J. P. How, “Adaptive Planning for Markov Decision Processes with Uncertain Transition Models via Incremental Feature Dependency Discovery”, European Conference on Machine Learning (ECML), 2012 [24% acceptance] [paper]
A. Geramifard, S. Tellex, D. Wingate, N. Roy, and J. P. How, “A Bayesian Approach to Finding Compact Representations for Reinforcement Learning”, European Workshops on Reinforcement Learning (EWRL), 2012 [68% acceptance] [paper]
A. Geramifard, J. Redding, J. Joseph, N. Roy, and J. P. How, “Model Estimation Within Planning and Learning”, American Control Conference (ACC), 2012 [55% acceptance] [paper]
A. Geramifard, “Practical Reinforcement Learning Using Representation Learning and Safe Exploration for Large Scale Markov Decision Processes”, PhD thesis, Aeronautics and Astronautics Department, Massachusetts Institute of Technology, MA, Nov 2011 [thesis]
A. Geramifard, J. Redding, J. Joseph, and J. P. How, “Model Estimation Within Planning and Learning”, Workshop on Planning and Acting with Uncertain Models, ICML, Bellevue, WA, USA, 2011 [paper]
A. Geramifard, F. Doshi, J. Redding, N. Roy, and J. P. How, “Incremental Feature Dependency Discovery”, Proceedings of the 23rd International Conference on Machine Learning (ICML), 2011 [28% acceptance] [paper] [slides] [poster] [video; the slides are not synced correctly, so please download them separately]
J. Redding, T. Toksoz, N. K. Ure, A. Geramifard, J. P. How, M. A. Vavrina, and J. Vian, “Distributed Multi-Agent Persistent Surveillance and Tracking With Health Management”, in AIAA Guidance, Navigation, and Control Conference (GNC), 2011 [paper]
A. Geramifard, J. Redding, N. Roy, and J. P. How, “UAV Cooperative Control with Stochastic Risk Model”, American Control Conference (ACC), 2011 [paper] [bibtex] [slides]
J. Redding, A. Geramifard, H.-L. Choi, and J. P. How, “Actor-critic policy learning in cooperative planning,” in AIAA Guidance, Navigation, and Control Conference (GNC), 2010 [paper][slides]
J. Redding, A. Geramifard, A. Undurti, H. Choi, and J. How, “An intelligent Cooperative Control Architecture,” in American Control Conference (ACC), 2010 [paper]
J. Redding, A. Geramifard, J. How, “Actor-Critic Policy Learning in Cooperative Planning”, Embedded Reasoning: Intelligence in Embedded Systems, AAAI Symposium, 2010 [paper]
R. He, A. Bachrach, M. Achtelik, A. Geramifard, D. Gurdan, S. Prentice, J. Stumpf, N. Roy, “On the Design and Use of a Micro Air Vehicle to Track and Avoid Adversaries”, International Journal of Robotics Research (IJRR), 2008 [paper]
A. Bachrach, A. Geramifard, D. Gurdan, R. He, S. Prentice, J. Stumpf, N. Roy, “Co-ordinated Planning Under Uncertainty with Air and Ground Vehicles”, Proceedings of the 11th International Symposium on Experimental Robotics (ISER), 2008. [paper]
R. Sutton, Cs. Szepesvári, A. Geramifard, and M. Bowling, “Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping”, Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI), pages 528-536, 2008. [28% acceptance] [paper] [slides]
M. Bowling, A. Geramifard, D. Wingate, “Sigma Point Policy Iteration”, Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 379-386, 2008. [22% acceptance] [paper]
A. Geramifard, “Incremental Least-Squares Temporal Difference Learning”, Master's thesis, Computing Science Department, University of Alberta, Edmonton, AB, Canada, January 2007 [thesis] [slides] [poster]
A. Geramifard, M. Bowling, M. Zinkevich, R. Sutton, “iLSTD: Eligibility Traces & Convergence Analysis”, In B. Schölkopf, J. C. Platt, and T. Hofmann, editors, Advances in Neural Information Processing Systems 19 (NIPS), pages 440-448, 2007. [24% acceptance] [paper] [slides]
A. Geramifard, M. Bowling, R. Sutton, “Incremental Least-Squares Temporal Difference Learning”, Proceedings of the 21st Conference, American Association for Artificial Intelligence (AAAI), pages 356-361, 2006. [30% acceptance] [paper] [slides] [poster]
A. Geramifard, P. Chubak, V. Bulitko, “Biased Cost Pathfinding”, Proceedings of the Second Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2006. [73% acceptance] [paper][poster]
A. Geramifard, P. Nayeri, R. Zamaninasab, J. Habibi, "A Hybrid Three Layer Architecture for Fire Agent Management in Rescue Simulation Environment", International Journal of Advanced Robotic Systems, Vol 2, No 2, June 2005 [paper]
A. Nouri, R. Zamani-Nasab, J. Habibi, A. Geramifard, "Task Allocation in Complex Multiagent Systems with Parallel Scheduling", Workshop on Information Technology & its Disciplines, Kish Island, Iran, February 2004
J. Habibi, M. Ahmadi, A. Nouri, M. M. Nevisi, A. Geramifard, P. Nayeri, M. Sayyadian, H. Khaleghi, M. Motamed, R. Zamaninasab, "Arian Agents: A Set of Implemented Agents for RoboCup Rescue Simulation Environment", In Proceedings of the RoboCup Symposium, Padova, Italy, 2003 [paper]
A. Geramifard, P. Nayeri, "Implementation of Fire Agent & Fire Station in the Rescue Multi-Agent Environment", BSc thesis, Computer Engineering Department, Sharif University of Technology, Tehran, Iran, September 2003 [thesis]