Saptarshi Ghosh
Associate Professor
Department of Computer Science and Engineering
Indian Institute of Technology Kharagpur
Kharagpur - 721302, West Bengal, India.
[Head, Max Planck Partner Group]
Email:
saptarshi @ cse . iitkgp . ac . in
saptarshi . ghosh @ gmail . com
Brief Bio
I am an Associate Professor at the Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Kharagpur. My primary research interests are legal data analytics, social network analysis, and algorithmic bias and fairness. My research is interdisciplinary and uses techniques from Machine Learning, Natural Language Processing, Information Retrieval, Computational Social Science, and Complex Network Theory. I head a Max Planck Partner Group at IIT Kharagpur that focuses on algorithmic bias and fairness.
I received my PhD in Computer Science from IIT Kharagpur in 2013. I was a Humboldt Post-doctoral research fellow at the Max Planck Institute for Software Systems (MPI-SWS), Germany.
Profiles: Google scholar | Scopus | DBLP | LinkedIn | Semantic Scholar
Note about internships: I am unable to offer summer/winter internships to students from outside IIT Kharagpur. Apologies for not being able to personally reply to the numerous emails about internships.
News
Our Language Model pre-trained on Indian legal text -- InLegalBERT -- reaches 270,000+ downloads !!
Paper accepted at ACL 2024 Main conference: IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning
Awarded Gemma Academic Program GCP Credit Award. Thanks Google for the support !!
Paper accepted at AAAI/ACM Conference on AI, Ethics and Society: Breaking the Global North Stereotype: A Global South-centric Benchmark Dataset for Auditing and Mitigating Biases in Facial Recognition Systems
Paper accepted at AAAI/ACM Conference on AI, Ethics and Society: Sponsored is the New Organic: Implications of Sponsored Results on Quality of Search Results in the Amazon Marketplace
Paper accepted in Artificial Intelligence and Law journal, Springer: Applicability of Large Language Models and Generative Models for Legal Case Judgement Summarization
Paper accepted at CSCW 2024: Investigating Nudges toward Related Sellers on E-commerce Marketplaces: A Case Study on Amazon
Organizing the ICPR2024 Competition on Multilingual Claim-Span Identification. Winning teams stand to win prizes and certificates [competition website]
Paper accepted at SIGIR 2024 (Resource & Reproducibility track): Legal Statute Identification: A Case Study using State-of-the-Art Datasets and Methods
Paper accepted at SIGIR 2024 (short): Instruction-Guided Bullet Point Summarization of Long Financial Earnings Call Transcripts
Project "NyayKosh: Multilingual Resources for AI-based Legal Analytics" sanctioned for funding by AI4ICPS Hub Foundation, IIT Kharagpur. Thanks for the support !!
Reached 5000 citations (according to Google Scholar)
Paper accepted at ACM Transactions on the Web: MuLX-QA: Classifying Multi-Labels and Extracting Rationale Spans in Social Media Posts
Paper accepted at NAACL 2024: Beyond Borders: Investigating Cross-Jurisdiction Transfer in Legal Case Summarization
Paper accepted at NAACL 2024: Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling
Paper accepted at CVPR 2024: Convolutional Prompting meets Language Models for Continual Learning
Paper accepted at ICWSM 2024: How COVID-19 has Impacted the Anti-Vaccine Discourse: A Large-Scale Twitter Study Spanning Pre-COVID and Post-COVID Era
Elevated to IEEE Senior Member
Co-editing a Special Issue of the Artificial Intelligence & Law journal on "Applications and Evaluation of Large Language Models in the Legal Domain".
Paper accepted at EMNLP 2023: MILDSum: A Novel Benchmark Dataset for Multilingual Summarization of Indian Legal Case Judgments (with Debtanu Datta, Shubham Soni, Rajdeep Mukherjee)
Paper accepted at ICCV 2023: Exemplar-Free Continual Transformer with Convolutions (with Anurag Roy, Vinay Verma, Sravan Voonna, Kripabandhu Ghosh, Abir Das)
Paper accepted at ICDAR 2023: TransDocAnalyser: A framework for semi-structured offline handwritten documents analysis with an application to legal domain (with Sagar Chakraborty, Gaurav Harit)
Co-chairing the Artificial Intelligence on Social Media (AISoMe) data challenge with the FIRE2023 conference, comprising a very challenging classification task on social media text - please participate !
Invited to speak at the Symposium on NLP for Social Good at University of Liverpool in June 2023, on NLP for the Legal domain.
Paper accepted at ICAIL 2023: Pre-trained Language Models for the Legal Domain: A Case Study on Indian Law (with Shounak Paul, Arpan Mandal, Pawan Goyal). Presents the first Transformer-based Language Models pre-trained on Indian legal text -- InLegalBERT, InCaseLawBERT, CustomInLawBERT -- an important resource for advancement of Legal NLP in India.
Congratulations Abhisek Dash for defending his PhD thesis titled "Bias and Fairness in Information Retrieval Algorithms on E-commerce Platforms" (co-supervised with Prof. Animesh Mukherjee). Abhisek is now a postdoc at MPI-SWS, Germany.
Organized the 3rd Symposium on AI and Law (SAIL2023) during February 24--26, 2023 as a hybrid event at IIIT Hyderabad. Details at https://sites.google.com/view/sail-2023/
Paper accepted in the Artificial Intelligence and Law journal: Ensemble methods for improving extractive summarization of legal case judgements (with Aniket Deroy and Kripabandhu Ghosh)
Delivered a tutorial on the application of AI in the legal domain at CODS-COMAD 2023, along with Jack Conrad (Director and Lead Research Scientist) and Shirsha Ray Chaudhuri (Director of Engineering) from Thomson Reuters, and Shounak Paul (PhD student) [Resource page]
Paper accepted at EMNLP 2022: ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts. Congratulations Rajdeep !!
Paper accepted at AACL-IJCNLP 2022: Legal Case Document Summarization: Extractive and Abstractive Methods and their Evaluation. Congratulations Abhay, Paheli, Soham, Rajdeep !!
Our work on fairness in e-commerce search covered by the Data Skeptic -- podcast available here
Congratulations to Paheli for defending her PhD at IIT Kharagpur!! Thesis title: Text and Graph Processing Methods for Assisting Legal Practitioners
Honoured to be awarded the A K Singh Young Faculty Award at Assistant Professor level, by IIT Kharagpur (2022) [for contributions towards teaching, research and institutional development]
Paper accepted at the Information Processing and Management journal: Legal Case Document Similarity: You Need Both Network and Text. Congratulations Paheli !!
Congratulations Abhisek for getting a postdoctoral offer from the Max Planck Institute for Software Systems (MPI-SWS) !!
Congratulations Arpan for getting a postdoctoral offer from the University of Southampton !!
Presented our work on Law-AI to a global Thomson Reuters audience, as part of the AI@TR Invited Speaker Series. Thanks Jack Conrad for the invite !!
Honoured to be appointed a Section Editor (on Legal Information Retrieval) for the prestigious Artificial Intelligence and Law journal.
Co-organized the 2nd Symposium on Artificial Intelligence and Law (SAIL 2022) during June 6-9, 2022, including talks by 8 eminent Law/AI researchers. Recordings available on YouTube.
Paper accepted in SIGIR 2022 Resource track: CAVES: A dataset to facilitate explainable classification and summarization of concerns towards COVID vaccines. Congratulations Soham, Azlaan, Rajdeep !!
Congratulations Arpan for defending his PhD at IIEST Shibpur (co-supervised with Prof. Sekhar Mandal)!! Thesis title: Catchphrases in Legal Case Documents: Identification and Applications
Paper accepted in IEEE Transactions on Computational Social Systems: FaiRIR: Mitigating Exposure Bias from Related Item Recommendations in Two-Sided Platforms. Congratulations Abhisek !!
Paper accepted in The Web Conference (formerly WWW) 2022: Alexa, in you, I trust! Fairness and Interpretability Issues in E-commerce Search through Smart Speakers. Congratulations Abhisek !!
Paper accepted in AAAI 2022: LeSICiN: A Heterogeneous Graph-based Approach for Automatic Legal Statute Identification from Indian Legal Documents. Congratulations Shounak !!
Paper accepted in ICWSM 2022: Winds of Change: Impact of COVID-19 on Vaccine-related Opinions of Twitter users. Congratulations Soham !!
Honoured to be appointed as the Head of a Max Planck Partner Group of MPI-SWS, Germany. Thanks Professor Krishna Gummadi for the support!!
Paper accepted in JURIX 2021: An Analytical Study of Algorithmic and Expert Summaries of Legal Cases. Congratulations Aniket, Paheli !!
Paper accepted in Artificial Intelligence and Law journal: Deep Learning for Rhetorical Role Labeling of Sentences in Legal Case Documents. Congratulations Paheli, Shounak !!
Co-organized the Indo-German Law-AI symposium (IGLAIS-2021) on September 29, 2021
Invited talk at the L3S Research Center, Leibniz University, Hannover, Germany on September 29, 2021 on our ongoing work on utilising social media to track opinions on COVID vaccines.
Our paper "Incorporating Domain Knowledge for Extractive Summarization of Legal Case Documents" awarded the Donald Berman Award for Best Student Paper at the International Conference on Artificial Intelligence and Law (ICAIL) 2021. Congratulations Paheli, Soham !!
Paper accepted in Artificial Intelligence and Law journal: A Sequence Labeling Model for Catchphrase Identification from Legal Case Documents. Congratulations Arpan !!
Co-organized the online Symposium on AI and Law (SAIL-2021) during May 31 - June 04, 2021. 5 days of exciting talks and panel discussions involving leading Law-AI researchers !!
PhD students Shounak Paul (co-supervised with Prof. Pawan Goyal) and Soham Poddar awarded the Prime Minister’s Research Fellowship (PMRF), the most prestigious Government fellowship in India. Congratulations Shounak and Soham !!
Paper accepted in International Conference on Artificial Intelligence and Law (ICAIL) 2021: Incorporating Domain Knowledge for Extractive Summarization of Legal Case Documents. Congratulations Paheli and Soham !!
Congratulations Moumita for defending her PhD at IIEST Shibpur (co-supervised with Prof. Sipra DasBit)!! Thesis title: Utilizing Social Media for Post-Disaster Resource Allocation and Emergency Preparedness
Congratulations Shalmoli for defending her MS at IIT Kharagpur. Thesis: Utilizing Social Media for Health Analytics !!
Paper accepted in ACM FAccT 2021: When the Umpire is also a Player: Bias in Private Label Product Recommendations on E-commerce Marketplaces. Congratulations Abhisek !!
TCG CREST sponsors project: Smart Legal Consultant: AI-based Legal Analytics. Thanks for the support !!
Paper accepted in Workshop on Data Analytics for Smart Health (with IEEE BigData) 2020: Utilizing Social Media for Identifying Drug Addiction and Recovery Intervention. Congratulations Shalmoli !!
Paper accepted in Workshop on Fair and Interpretable Learning Algorithms (with IEEE BigData) 2020: Fairness for Whom? Understanding the Reader's Perception of Fairness in Text Summarization. Congratulations Anurag Shandilya, Abhisek !!
Paper accepted in COLING 2020: Automatic Charge Identification from Facts: A Few Sentence-Level Charge Annotations is All You Need. Congratulations Shounak !!
Paper accepted in CIKM 2020: ZSCRGAN: A GAN-based Expectation-Maximization Model for Zero-Shot Retrieval of Images from Textual Descriptions. Congratulations Anurag !!
M.Tech. thesis of Shounak Paul (joint student with Dr. Pawan Goyal) awarded Best M.Tech. Thesis of CSE Department, IITKGP. Congratulations Shounak !!
Our paper "Identification of Rhetorical Roles of Sentences in Indian Legal Judgments" awarded the JURIX 2019 Best Paper Award. Congratulations to Paheli, Shounak, Kripabandhu and Adam !! Media coverage: The Economic Times, Hindustan Times and India Today