Suraj has showcased his skills through diverse academic, professional, and personal projects, including open-source contributions:
Emotion Recognition in Multi-party Conversation using Acoustic Context
• Modeled an RNN-based architecture with context fusion for emotion recognition in multi-party conversation. Observed a 3%
improvement in weighted-F1 score for models using acoustic and previous-emotion contexts over non-contextualized models.
Multi-modal and Multi-Hop Information Retrieval on WebQA dataset
• Crafted a multi-modal visual-linguistic fusion architecture that used the CLIP model to extract textual and visual representations
(from images and image captions) and projected them into a common representation space. Defined a custom batching strategy that
generates in-batch negatives to bolster learning.
• Evaluated against two-stage unimodal re-ranking models and observed a 56% improvement in Recall@100 with the fusion model.
Spatio-Temporal Causal Discovery Framework for Hydrology System
Worked on a spatio-temporal causal discovery framework that improves causal relationship identification by enforcing spatial and temporal constraints for water terrains. Collaborated with the NSF product team to define the requirements.
Predicted flow rates across different years, outperforming traditional causal models.
SQL Query Optimization [Open-Source with IBM] | scikit-learn, SQL
Contributed in-database machine learning data preprocessing that eliminates unnecessary data movement. Decreased preprocessing time by 70% for large datasets.
Secure Hospital Management System
Built a full-stack web application using Django, MongoDB, and AWS.
Developed a blockchain payment gateway and a navigation chatbot. Hosted the application on AWS.
Real-time Event Detection (Open Source)
Analyzed Twitter data. Identified clusters using KMeans.
Visualized events in real-time with D3.js and Flask.
SQL Query Optimization (Open Source with IBM)
Developed in-database ML data preprocessing, reducing time by 70%.
Pushed computation into PostgreSQL/DB2, enabling 10x larger datasets.
Automation Report Tool
Built a bug-mapping tool on the MERN stack and deployed it on AWS.
Integrated with CommScope's MLISA tool.
High Level Automation Framework
Created Python scripts to interact with various devices.
Developed predefined functions using Robot Framework.
Person Re-Identification Model
Designed a PyTorch model for CCTV-based person re-identification.
Achieved 92.4% efficiency on the CrowdHuman dataset.