Sourav Das
Hello!
Welcome to my website. I am a researcher based in Kolkata, India. My research spans Natural Language Processing, Language Generation, Information Retrieval, Large-Scale Event Analysis, and Machine Learning.
Ph.D. Scholar (ACM Anveshan Setu Ph.D. Fellow 2023)
Department of Computer Science and Engineering
Indian Institute of Information Technology Kalyani
(An Institute of National Importance)
West Bengal, India
Updates:
I have joined the EMNLP 2024 Industry Track as an Ethics Reviewer - August 2024.
Our paper “AcKnowledge: Acquired Knowledge Representation by Small Language Model Without Pre-training” has been accepted at the ACL 2024 Workshop Towards Knowledgeable Language Models - July 2024.
I have joined as a Program Committee Member for the EMNLP 2024 workshop on the 4th International Conference on Natural Language Processing for Digital Humanities (NLP4DH 2024) - June 2024.
My opinion on “Why India Needs to Build its Focus on AI Theory” is featured in Analytics India Magazine, the flagship Indian magazine on AI - June 2024.
I have joined as a Technical Program Committee Member for the 3rd Congress on Smart Computing Technologies (CSCT 2024), organized by the National Institute of Technology Sikkim in association with the Soft Computing Research Society India - May 2024.
My doctoral research summary:
Large language models often hallucinate and produce misinformation in the text they generate. Most LLM-based chat agents cannot track changing contexts and topics well enough to deliver the expected responses with natural semantics. The knowledge these models represent is also often static or outdated. In addition, very few LLMs are currently optimized for low-resource Indo-Aryan languages, especially Bengali, so that they can represent local social themes and conversational contexts and support applications across domains. My doctoral research addresses these issues together, with the aim of more efficient language generation.
Find me on: