  1. S Zhu, A Karpovich, A Chen, J Koscheka, S Jannu, D Wen, Y Zhu, R Jain, A. Geramifard, "Agentic Reinforcement Learning for Real-World Code Repair", arXiv 2025 [paper]

  2. S Zhu, Y Jiang, H Sang, S Tang, Q Song, B He, R Jain, Z Wang, A Geramifard, "Planner-R1: Reward Shaping Enables Efficient Agentic RL with Smaller LLMs", arXiv 2025 [paper]

  3. C. Gunasekara, ... A. Geramifard, ... "Overview of the Ninth Dialog System Technology Challenge: DSTC9", IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 [paper]

  4. R Chitnis, S Yang, A. Geramifard, "Sequential Decision-Making for Inline Text Autocomplete", Reinforcement Learning Conference (RLC), 2024 [36% acceptance][paper]

  5. P. Bhargava, R. Chitnis, A. Geramifard, S. Sodhani, A. Zhang, "Sequence Modeling is a Robust Contender for Offline Reinforcement Learning", International Conference on Learning Representations (ICLR), 2023 [31% acceptance][paper]

  6. H. Sikchi, R. Chitnis, A. Touati, A. Geramifard, A. Zhang, S. Niekum, "Score Models for Offline Goal-Conditioned Reinforcement Learning", International Conference on Learning Representations (ICLR), 2023 [31% acceptance][paper]

  7. T. Huang, S. Halbe, C. Sankar, P. Amini, S. Kottur, A. Geramifard, M. Razaviyayn, A. Beirami, "Robustness through Data Augmentation Loss Consistency", Transactions on Machine Learning Research (TMLR), 2023 [paper]

  8. S. Moon, S. Kottur, A. Geramifard, B. Damavandi, "Navigating Connected Memories with a Task-oriented Dialog System", Empirical Methods in Natural Language Processing (EMNLP), 2022 [paper - Oral]

  9. K. Qian, S. Kottur, A. Beirami, S. Shayandeh, P. Crook, A. Geramifard, Z. Yu, and C. Sankar, "Database search results disambiguation for task-oriented dialog systems," 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2022 [paper]

  10. S. Kottur, S. Moon, A. Geramifard, B. Damavandi, "SIMMC 2.0: A Task-oriented Dialog Dataset for Immersive Multimodal Conversations", Empirical Methods in Natural Language Processing (EMNLP), 2022 [paper]

  11. H. Le, C. Sankar, S. Moon, A. Beirami, A. Geramifard, and S. Kottur, "DVD: A diagnostic dataset for multi-step reasoning in video grounded dialogue," The 59th Annual Meeting of the Association for Computational Linguistics (ACL), 2021 [paper][code]

  12. K. Qian, A. Beirami, Z. Lin, A. De, A. Geramifard, Z. Yu, and C. Sankar, "Annotation inconsistency and entity bias in MultiWOZ," The 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SigDial), 2021 [paper][video]

  13. S. Kottur, C. Sankar, Z. Yu, A. Geramifard, "DialogStitch: Synthetic Deeper and Multi-Context Task-Oriented Dialogs", Special Interest Group on Discourse and Dialogue (SigDial), 2021 [paper][code]

  14. S. Kottur, P. A. Crook, S. Moon, A. Beirami, E. Cho, R. Subba, and A. Geramifard, "An analysis of state-of-the-art models for situated interactive multimodal conversations (SIMMC)," The 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue (SigDial), 2021 [paper][code][video]

  15. T. Huang, S. Halbe, C. Sankar, P. Amini, S. Kottur, A. Geramifard, M. Razaviyayn, and A. Beirami, "DAIR: Data augmented invariant regularization," Transactions on Machine Learning Research (TMLR), 2022. [paper][code]

  16. S. Moon, S. Kottur, P. A. Crook, A. De, S. Poddar, T. Levin, D. Whitney, D. Difranco, A. Beirami, E. Cho, R. Subba, and A. Geramifard, "Situated and interactive multimodal conversations," The 28th International Conference on Computational Linguistics (COLING), 2020 [paper][code]

  17. J. Mendez, A. Geramifard, M. Ghavamzadeh, B. Liu, “Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings”, 3rd Workshop on Conversational AI Today’s Practice Tomorrow’s Potential, NeurIPS, 2019 [paper]

  18. P. A. Crook, S. Poddar, A. De, S. Shafi, D. Whitney, A. Geramifard, R. Subba, "SIMMC: Situated Interactive Multi-Modal Conversational Data Collection And Evaluation Platform", IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Demonstration Track, 2019 [paper - Best Paper Award]

  19. P. K. Bodigutla, L. Wang, K. Ridgeway, J. Levy, S. Joshi, A. Geramifard, S. Matsoukas, “Domain-Independent turn-level Dialogue Quality Evaluation via User Satisfaction Estimation”, Special Interest Group on Discourse and Dialogue (SigDial), Special Session, 2019 [paper]

  20. M. Fazel-Zarandi, S. Li, J. Cao, P. Henderson, A. Geramifard, "Learning Robust Dialog Policies for Conversational Error Recovery," Amazon Machine Learning Conference (AMLC), 2018 (Oral - 4% acceptance rate)

  21. P. K. Bodigutla, Y. Yan, S. Joshi, A. Geramifard, "Sentiment Analysis in Human-Machine Interaction", Amazon Machine Learning Conference (AMLC), 2017

  22. J. Casale, J. Cao, S. Li, A. Geramifard, "Toward a Conversational Bot - MovieBot", Amazon Machine Learning Conference (AMLC), 2017

  23. M. Fazel-Zarandi, S. Li, J. Cao, J. Casale, D. Whitney, and A. Geramifard, “Learning Robust Dialog Policies in Noisy Environments”, Workshop on Conversational AI Today’s Practice Tomorrow’s Potential, NeurIPS, 2017 [paper]

  24. A. Geramifard, C. Dann, R. H. Klein, W. Dabney, J. P. How, “RLPy: A Value-Function-Based Reinforcement Learning Framework for Education and Research”, Journal of Machine Learning Research, JMLR, 2015 [paper]

  25. S. Ponda, L. B. Johnson, A. Geramifard, J. P. How, “Cooperative Mission Planning for Multi-UAV Teams”, in the Handbook of Unmanned Aerial Vehicles, Chapter 16, Springer, 2014 [Springer Link] [Amazon Link]

  26. A. Geramifard, T. Walsh, S. Tellex, G. Chowdhary, N. Roy, J. P. How, “A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning”, Foundations and Trends in Machine Learning (FTML), 2013 [paper]

  27. T. Campbell, R. Klein, A. Geramifard, J. How, “Simultaneous Clustering on Representation Expansion for Learning Multimodel MDPs”, The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2013 [paper]

  28. A. Geramifard, C. Dann, J. How, “Off-Policy Learning Combined with Automatic Feature Expansion for Solving Large MDPs”, The 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2013 [paper]

  29. C. Amato, G. Chowdhary, A. Geramifard, K. Ure, “Decentralized Control of Partially Observable Markov Decision Processes”, The 52nd IEEE Conference on Decision and Control (CDC), 2013 [paper]

  30. A. Geramifard, T. Walsh, N. Roy, and J. How, “Batch iFDD: A Scalable Matching Pursuit Algorithm for Solving MDPs”, The Conference on Uncertainty in Artificial Intelligence (UAI), 2013 [31% acceptance] [paper]

  31. J. Joseph, A. Geramifard, W. Roberts, J. How and N. Roy,  “Reinforcement Learning with Misspecified Model Classes”, IEEE International Conference on Robotics and Automation (ICRA), 2013 [39% acceptance][nominated for Best Cognitive Robotics Paper Award] [paper]

  32. J. Joseph, A. Geramifard, J. How and N. Roy,  “Reinforcement Learning with Misspecified Bayesian Nonparametric Model Classes”, Workshop on Bayesian Nonparametric Models For Reliable Planning And Decision-Making Under Uncertainty, NeurIPS, 2012 [paper]

  33. A. Geramifard, J. Redding, J. P. How, “Intelligent Cooperative Control Architecture: A framework for performance improvement using safe learning”, Journal of Intelligent and Robotic Systems (JIRS), Springer, 2012 [paper]

  34. K. Ure, A. Geramifard, G. Chowdhary, J. P. How, “Adaptive Planning for Markov Decision Processes with Uncertain Transition Models via Incremental Feature Dependency Discovery”, European Conference on Machine Learning (ECML), 2012 [24% acceptance] [paper]

  35. A. Geramifard, S. Tellex, D. Wingate, N. Roy, and J. P. How, “A Bayesian Approach to Finding Compact Representations for Reinforcement Learning”, European Workshops on Reinforcement Learning (EWRL), 2012 [68% acceptance] [paper]

  36. A. Geramifard, J. Redding, J. Joseph, N. Roy, and J. P. How, “Model Estimation Within Planning and Learning”, American Control Conference (ACC), 2012 [55% acceptance] [paper]

  37. A. Geramifard, “Practical Reinforcement Learning Using Representation Learning and Safe Exploration for Large Scale Markov Decision Processes”, PhD thesis, Aeronautics and Astronautics Department, Massachusetts Institute of Technology, MA, Nov 2011 [thesis]

  38. A. Geramifard, J. Redding, J. Joseph, and J. P. How, “Model Estimation Within Planning and Learning”, Workshop on Planning and Acting with Uncertain Models, ICML, Bellevue, WA, USA, 2011 [paper]

  39. A. Geramifard, F. Doshi, J. Redding, N. Roy, and J. P. How, “Incremental Feature Dependency Discovery”, Proceedings of the 28th International Conference on Machine Learning (ICML), 2011 [28% acceptance] [paper] [slides] [poster] [video] (The slides in the video are not synced correctly; please download them separately.)

  40. J. Redding, T. Toksoz, N. K. Ure, A. Geramifard, J. P. How, M. A. Vavrina, and J. Vian, “Distributed Multi-Agent Persistent Surveillance and Tracking With Health Management”, in AIAA Guidance, Navigation, and Control Conference (GNC), 2011 [paper]

  41. A. Geramifard, J. Redding, N. Roy, and J. P. How, “UAV Cooperative Control with Stochastic Risk Model”, American Control Conference (ACC), 2011 [paper] [bibtex] [slides]

  42. J. Redding, A. Geramifard, H.-L. Choi, and J. P. How, “Actor-critic policy learning in cooperative planning,” in AIAA Guidance, Navigation, and Control Conference (GNC), 2010 [paper][slides]

  43. J. Redding, A. Geramifard, A. Undurti, H. Choi, and J. How, “An intelligent Cooperative Control Architecture,” in American Control Conference (ACC), 2010 [paper]

  44. J. Redding, A. Geramifard, J. How “Actor-Critic Policy Learning in Cooperative Planning”, Embedded Reasoning: Intelligence in Embedded Systems, AAAI Symposium, 2010 [paper]

  45. R. He, A. Bachrach, M. Achtelik, A. Geramifard, D. Gurdan, S. Prentice, J. Stumpf, N. Roy, “On the Design and Use of a Micro Air Vehicle to Track and Avoid Adversaries”, International Journal of Robotics Research (IJRR), 2008 [paper]

  46. A. Bachrach, A. Geramifard, D. Gurdan, R. He, S. Prentice, J. Stumpf, N. Roy, “Co-ordinated Planning Under Uncertainty with Air and Ground Vehicles”, Proceedings of the 11th International Symposium on Experimental Robotics (ISER), 2008. [paper] 

  47. R. Sutton, Cs. Szepesvári, A. Geramifard, and M. Bowling, “Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping”, Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence (UAI), pages 528-536, 2008. [28% acceptance] [paper] [slides]

  48. M. Bowling, A. Geramifard, D. Wingate, “Sigma Point Policy Iteration”, Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 379-386, 2008. [22% acceptance] [paper] 

  49. A. Geramifard, “Incremental Least-Squares Temporal Difference Learning”, Master's thesis, Computing Science Department, University of Alberta, Edmonton, AB, Canada, January 2007 [thesis] [slides] [poster]

  50. A. Geramifard, M. Bowling, M. Zinkevich, R. Sutton, “iLSTD: Eligibility Traces & Convergence Analysis”, in B. Schölkopf, J. C. Platt, and T. Hofmann, editors, Advances in Neural Information Processing Systems 19 (NIPS), pages 440-448, 2007. [24% acceptance] [paper] [slides]

  51. A. Geramifard, M. Bowling, R. Sutton, “Incremental Least-Squares Temporal Difference Learning”, Proceedings of the 21st Conference, American Association for Artificial Intelligence (AAAI), pages 356-361, 2006. [30% acceptance] [paper] [slides] [poster]

  52. A. Geramifard, P. Chubak, V. Bulitko, “Biased Cost Pathfinding”, Proceedings of the Second Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2006. [73% acceptance] [paper][poster]

  53. A. Geramifard, P. Nayeri, R. Zamaninasab, J. Habibi, "A Hybrid Three Layer Architecture for Fire Agent Management in Rescue Simulation Environment", International Journal of Advanced Robotic Systems, Vol 2, No 2, June 2005 [paper]

  54. A. Nouri, R. Zamani-Nasab, J. Habibi, A. Geramifard, "Task Allocation in Complex Multiagent Systems with Parallel Scheduling", Workshop on Information Technology & its Disciplines, Kish Island, Iran, February 2004

  55. J. Habibi, M. Ahmadi, A. Nouri, M. M. Nevisi, A. Geramifard, P. Nayeri, M. Sayyadian, H. Khaleghi, M. Motamed, R. Zamaninasab, "Arian Agents: A Set of Implemented Agents for RoboCup Rescue Simulation Environment", In Proceedings of the RoboCup Symposium, Padova, Italy, 2003 [paper]

  56. A. Geramifard, P. Nayeri, "Implementation of Fire Agent & Fire Station in the Rescue Multi-Agent Environment", BSc thesis, Computer Engineering Department, Sharif University of Technology, Tehran, Iran, September 2003 [thesis]
