As part of an academic NLP project, my team built an intelligent chatbot using the RAG-Instruct dataset.
The project explores the full pipeline of a modern NLP system, from data preprocessing and analysis to training transformer-based models that answer diverse, information-rich queries. We applied document clustering and Word2Vec embeddings, and fine-tuned models and rerankers on question-document-answer triples. The result is a chatbot capable of understanding context-rich questions and generating relevant, informed answers by grounding its responses in retrieved Wikipedia-based documents.
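The retrieve-then-ground idea can be illustrated with a minimal Python sketch. It is not the project's implementation: plain bag-of-words cosine similarity stands in for the Word2Vec embeddings and fine-tuned reranker, the toy corpus is hypothetical rather than drawn from RAG-Instruct, and the function names (`retrieve`, `build_prompt`) are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, top_k=2):
    """Return up to top_k documents ranked by similarity to the question.

    Assumption: term-count vectors replace learned embeddings here.
    """
    q_vec = Counter(tokenize(question))
    scored = [(cosine(q_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no terms with the question.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(question, retrieved):
    """Assemble a grounded prompt: numbered passages, then the question."""
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical toy corpus, not taken from RAG-Instruct.
DOCS = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Paris is the capital and most populous city of France.",
]

question = "Where is the Eiffel Tower?"
top_docs = retrieve(question, DOCS, top_k=2)
prompt = build_prompt(question, top_docs)
print(prompt)
```

In a full system, the prompt built this way would be passed to the generator model, so the answer is conditioned on the retrieved passages rather than on the question alone.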
This project predicts pest incidence in crops by integrating Sentinel satellite imagery with weather data. Collaborated with the NGO WOTR to process and clean more than 150,000 data points, reducing noise and optimizing the dataset for binary classification (pest/no-pest). Achieved a 3% improvement in model accuracy by incorporating geospatial features, with XGBoost and Random Forest emerging as the top-performing models. The project supported precision agriculture efforts in rural India by enabling early intervention and improving crop resilience.
In this paper, we conduct a comparative analysis of four leading large language models (GPT-4, GPT-3.5, Google Gemini, and Meta LLaMA) for automated UML class diagram generation from textual descriptions. Using a curated dataset and consistent evaluation metrics, we assess each model’s accuracy, clarity, and domain understanding. Our results reveal distinct strengths and weaknesses across the models, with GPT-3.5 achieving the most reliable performance. This study offers practical guidance for leveraging LLMs in software engineering tasks. You can request to read the paper here.
At HackUPC 2024 in Barcelona, we developed Musafir, a Python library designed to enhance business travel for TravelPerk users. It simplifies logistics with features like fastest flight recommendations, tailored hotel stays, and networking opportunities. This project demonstrates skills in Python programming, API integration, data analysis, and user experience optimization. You can access our project here.
At the UPC Startup Challenge, we developed Griham, a platform that facilitates co-housing by connecting families, elderly individuals, couples, and singles who have spare space with people seeking shared accommodation. I created the entire website design from scratch using Figma. Griham offers affordable living options, fosters community interaction, and provides flexibility in housing arrangements. Tenants can reduce their rent by contributing to household chores, creating a cooperative environment of shared responsibility. You can access the project here.