Rima Hazra
Indian Institute of Technology Kharagpur
Aug'2025 - New paper out! "AURA: Affordance-Understanding and Risk-aware Alignment Technique for Large Language Models" [PDF]
May'2025 - New paper out! "Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations" [PDF]
May'2025 - New paper out! "MemeSense: An Adaptive In-Context Framework for Social Commonsense Driven Meme Moderation" [PDF]
Feb'2025 - New paper out! "Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment" [PDF]
Feb'2025 - 🎯 Paper accepted at NAACL Industry Track 2025! 🎊 "Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance" [PDF]
Jan'2025 - 🎯Paper accepted at NAACL Main 2025!🎊 "Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models" [PDF][Code]
Dec'2024 - 🎉 Paper accepted at AAAI 2025 AI Alignment Track! 🎯 "SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models" [PDF]
Nov'2024 - 🎉 Paper accepted at ICWSM 2025! 🎯 "How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries" [PDF]
Nov'2024 - Guest Lectures! Recently delivered two lectures on AI and Safety Alignment (as part of the NLP course) at CSE, IIT Kharagpur.
Oct'2024 - New paper released! 🎊 "Navigating the Cultural Kaleidoscope: A Hitchhiker's Guide to Sensitivity in Large Language Models" [PDF][Code]
Oct'2024 - 🎉 Paper accepted at EMNLP 2024 Industry Track! 🎯 "Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context" [PDF]
Sept'2024 - 🎉 Paper accepted at EMNLP 2024 Main! 🎯 "Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations" [PDF][Code]
2024 - New! 🌟 Received the prestigious PaliGemma Academic Program GCP Credit Award! 🤗
2024 - 📣 Recently delivered a talk on AI and Safety at ACM Summer School on Generative AI for Text 2024. Access all the materials here.
2024 - New paper released! 🎊 "Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations" [PDF][Code]
2024 - New paper released! 🎉 "Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance" [PDF]
2024 - New paper released! 🎯 "SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models" [PDF]
2024 - New!🎉 Received the prestigious Microsoft Academic Partnership Grant (MAPG) 2024 in collaboration with Prof. Animesh Mukherjee from IIT Kharagpur. Our proposal is among just five selected across India! Congratulations to the team members!
2024 - Paper accepted at ECML PKDD 2024! "DistALANER: Distantly Supervised Active Learning Augmented Named Entity Recognition in the Open Source Software Ecosystem" [PDF][Code]
2024 - Paper accepted at ACL 2024! "Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models" [PDF]
2024 - New paper out! "How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries" [PDF][Data]
2024 - New paper out! "Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models" [PDF]
2024 - New paper out! "Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context" [PDF]
2024 - New paper out! "DistALANER: Distantly Supervised Active Learning Augmented Named Entity Recognition in the Open Source Software Ecosystem" [PDF]
2023 - New paper out! "Redefining Developer Assistance: Through Large Language Models in Software Ecosystem" [PDF]
2023 - Paper accepted to ASONAM 2023! "Duplicate Question Retrieval and Confirmation Time Prediction in Software Communities" [PDF]