Master Thesis Work and Publications

VQA-GEN: Multimodal Dataset for VQA Domain Generalization (Master Thesis)

Developed the first multimodal dataset for VQA domain generalization, with 2M question-answer pairs. Introduced multimodal shifts to improve model generalization. Outperformed benchmarks such as VQA-CP on cross-domain generalization.

UPREVE: An End-to-End Causal Discovery Benchmarking System

We propose UPREVE, a new benchmarking tool for causal discovery.

Enables customizable workflows to evaluate models.
Introduces automatic preprocessing with customization.
Supports 15+ algorithms on 15+ datasets in R and Python.
Open-source and modular architecture.

Causal Data Fusion for Multi-modal Disaster Classification in Social Media

We propose a novel multi-modal causal data fusion mechanism, which models the relationships between modalities via causal graphs. For each causal graph, our framework proposes an approach to combine the modalities such that the integrated data only contain informative, non-redundant information for classification. We further propose to eliminate the need for relationships between modalities (thus the causal graphs) to be known a priori.

PFL (Prompt-based Few-Shot Learning)

Current VQA models require fine-tuning on target datasets, which is computationally expensive. Few-shot learning adapts models with only a few target examples but still updates all parameters. Prompt-based learning further reduces computation by updating only a small subset of parameters. This enables efficient domain adaptation for VQA with lower resource requirements.

Grocery Dataset Recognition

● Given a grocery store shelf image, detect all products present in the shelf image (detection only at product or no-product level).
● The assignment requires you to implement a single shot object detector with only “one” anchor box per feature-map cell.
● Accuracy of at least 0.7 mAP on the test set.

Dataset:

Grocery Dataset (https://github.com/gulvarol/grocerydataset).

Here, I used a single shot detection (SSD) algorithm to detect multiple products in a shelf image. I preferred it over YOLO because SSD generally gives higher precision.
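The "one anchor box per feature-map cell" constraint can be illustrated with a minimal sketch. This is not the code from the project; the function name `single_anchor_grid`, the feature-map size, the scale, and the aspect ratio are all hypothetical values chosen for illustration. It only shows how a single anchor, centered on each cell, might be laid out in normalized image coordinates.

```python
from typing import List, Tuple

# An anchor is (cx, cy, w, h), all normalized to the [0, 1] image range.
Box = Tuple[float, float, float, float]


def single_anchor_grid(
    fmap_h: int, fmap_w: int, scale: float, aspect_ratio: float = 1.0
) -> List[Box]:
    """Return exactly one anchor box per feature-map cell.

    aspect_ratio is width / height; scale is the square root of the box area.
    Both are illustrative assumptions, not values from the assignment.
    """
    w = scale * aspect_ratio ** 0.5
    h = scale / aspect_ratio ** 0.5
    anchors: List[Box] = []
    for row in range(fmap_h):
        for col in range(fmap_w):
            # Center the anchor in the middle of its cell.
            cx = (col + 0.5) / fmap_w
            cy = (row + 0.5) / fmap_h
            anchors.append((cx, cy, w, h))
    return anchors


# Hypothetical 4x4 feature map with a box that is taller than it is wide.
anchors = single_anchor_grid(fmap_h=4, fmap_w=4, scale=0.2, aspect_ratio=0.5)
print(len(anchors), anchors[0])
```

With a single anchor per cell, the number of candidate boxes equals the number of feature-map cells, so the choice of scale and aspect ratio matters more than in a standard SSD with several anchors per cell.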

SQLDataPrep4ML - SQL Data Preprocessing Library

A library that implements machine learning data preprocessing functions and processes data directly in the SQL database. In contrast to the in-memory execution of traditional libraries such as scikit-learn, it does not retrieve the data from the source; instead, it dynamically generates SQL queries and executes them in the RDBMS where the data is stored.
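The idea of generating SQL instead of pulling data into memory can be sketched as follows. This does not use the SqlDataPrep4ML API; the function `min_max_scale_sql` and the `products` table are hypothetical, and SQLite (from the Python standard library) stands in for DB2 or PostgreSQL so that the example is self-contained. It shows a min-max scaling step expressed as a generated query that the database executes.

```python
import sqlite3


def min_max_scale_sql(table: str, column: str) -> str:
    """Build a query that min-max scales `column` inside the database.

    Hypothetical helper for illustration; not part of SqlDataPrep4ML.
    """
    # Identifiers are interpolated into the query, so accept only plain names.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("table and column must be plain identifiers")
    return (
        f"SELECT ({column} - s.lo) * 1.0 / (s.hi - s.lo) AS {column}_scaled "
        f"FROM {table} CROSS JOIN "
        f"(SELECT MIN({column}) AS lo, MAX({column}) AS hi FROM {table}) AS s "
        f"ORDER BY {column}"
    )


# In-memory SQLite database standing in for the real RDBMS backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (price REAL)")
conn.executemany("INSERT INTO products VALUES (?)", [(10.0,), (20.0,), (30.0,)])

query = min_max_scale_sql("products", "price")
# Only the scaled result leaves the database; the raw rows are never fetched.
scaled = [row[0] for row in conn.execute(query)]
print(scaled)
conn.close()
```

The minimum and maximum are computed by the database in the subquery, so the client never needs to hold the full column in memory.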

The current version supports the following RDBMS backends: IBM DB2 (LUW and Z) and PostgreSQL.

GitHub: https://github.com/IBM/SqlDataPrep4ML

Simmelian-backbone

This software implements a Simmelian backbone network analysis as described by Nick, Lee, Cunningham, and Brandes in 2013, and by Nocaj, Ortmann, and Brandes in 2015. The method has an almost magical ability to untangle "hairball" networks by removing links that are not part of embedded relationships. The effect is apparent in this figure from the Nick et al. paper showing a large Facebook friends network:
