Rishika Bhagwatkar
I am a research master's student at Mila and the Université de Montréal (UdeM) under the supervision of Prof. Irina Rish. My main research interests are multimodal representation learning and continual learning.
Previously, I was a research intern at ALMAnaCH, Inria, Paris, under the supervision of Dr. Djamé Seddah, where my work focused on studying the interactions of various modalities in real-time game sessions.
I have also worked at the intersection of self-supervised and continual learning with Prof. Christopher Kanan at the Rochester Institute of Technology, New York. As a DAAD WISE Scholar, I worked on appraisal-based emotion recognition from social media data under Dr. Roman Klinger and Dr. Carina Silberer at the University of Stuttgart.
I completed my bachelor's [thesis] in Electronics and Communication Engineering at the Visvesvaraya National Institute of Technology, India. I also actively mentor projects on understanding and improving language (and multimodal) representations at IvLabs.
Besides research, I like to spend my time quilling, reading and visiting new places.
News
[Oct 2023] Our paper on Lag-Llama: Towards Foundation Models for Time Series Forecasting was accepted at the NeurIPS R0-FoMo Workshop 2023.
[Sept 2023] Received Université de Montréal Exemption Scholarship worth 10,000 CAD.
[Sept 2023] Started my research masters at Mila.
[May 2022] Our paper on Challenges in scene understanding for autonomous systems was accepted at AIR 2022.
[Apr 2022] Our paper on Contrastive Learning-Based Domain Adaptation for Semantic Segmentation was accepted at NCC 2022.
[Mar 2022] Awarded the Charpak Lab Scholarship for my research on multimodal data in game sessions at ALMAnaCH, Inria, Paris.
[Oct 2021] Our paper on Enhancing Context Through Contrast was accepted at the NeurIPS 2021 Pre-registration workshop.
Publications
Contrastive Learning-Based Domain Adaptation for Semantic Segmentation
National Conference on Communications (NCC), 2022
R. Bhagwatkar, S. Kemekar, V. Domatoti, K. Khan, A. Singh
In this work, we hypothesize that real-world images and their corresponding synthetic images are different views of the same abstract representation. To enhance the quality of domain-invariant features, we increase the mutual information between the two inputs.
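A common way to increase mutual information between two views is to maximize a contrastive lower bound such as InfoNCE, where paired real/synthetic embeddings act as positives and the rest of the batch as negatives. The sketch below is illustrative only: the function name, temperature value, and the choice of InfoNCE as the bound are my assumptions, not necessarily the exact objective used in the paper.

```python
import numpy as np

def info_nce(real_emb, synth_emb, temperature=0.1):
    """InfoNCE: a standard contrastive lower bound on mutual information.

    The i-th real embedding is paired with the i-th synthetic embedding
    (positive); all other synthetic embeddings in the batch serve as
    negatives. Names and the temperature are illustrative assumptions.
    """
    # L2-normalize so dot products become cosine similarities.
    r = real_emb / np.linalg.norm(real_emb, axis=1, keepdims=True)
    s = synth_emb / np.linalg.norm(synth_emb, axis=1, keepdims=True)
    logits = r @ s.T / temperature                # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal of the similarity matrix.
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pulls paired real/synthetic representations together while pushing apart unpaired ones, which is what encourages domain-invariant features.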
Challenges in scene understanding for autonomous systems
International Conference on Advancements in Interdisciplinary Research (AIR), 2022
R. Bhagwatkar, S. Kemekar, V. Domatoti, K. Khan, A. Singh
In this work, we present various limitations and drawbacks of current autonomous pipelines, along with solutions to mitigate them.
Enhancing Context Through Contrast
NeurIPS 2021 Workshop on Pre-registration in Machine Learning
K. Ambilduke, A. Shetye, D. Bagade, R. Bhagwatkar, K. Fitter, P. Vagdargi, S. Chiddarwar
We posit that languages are linguistic transforms that map abstract meaning to sentences. We attempt to extract and investigate this abstract space by optimizing the Barlow Twins objective between latent representations of parallel sentences.
Paying Attention to Video Generation
NeurIPS 2020 Workshop on Pre-registration in Machine Learning, PMLR 148:139-154, 2021
R. Bhagwatkar, K. Fitter, S. Bachu, A. Kulkarni, S. Chiddarwar
Just like sentences are series of words, videos are series of images. Inspired by the success of large language models in predicting language, we attempt to generate videos using a GPT and a novel Attention-based Discretized Autoencoder.
A Review of Video Generation Approaches
International Conference on Power, Instrumentation, Control and Computing (PICC), 2020
R. Bhagwatkar, K. Fitter, S. Bachu, A. Kulkarni, S. Chiddarwar
In this work, we study and discuss several approaches for generating videos, ranging from Generative Adversarial Networks (GANs) to sequential models such as LSTMs. Further, we compare the strengths and weaknesses of each approach, with the underlying motivation of providing a broad and rigorous review of the subject.
Projects
Medical VQA
Deployed various Visual Question Answering models on medical datasets.
Improved Facebook AI Research’s MMF framework for medical data.
Achieved leaderboard performance on the ImageCLEF-2019 dataset.
Video Generation
Aimed at generating entire frames and not pixel-level predictions.
Developed a novel Attention-Based Discretized Autoencoder (ADAE).
Coupled the ADAE with a GPT-2 for video generation.
Neural Machine Translation
Language Modelling
Generated Dinosaur names using Character-level RNNs.
Developed a paragraph generator to generate text from Harry Potter novels.
Implemented RNNs from scratch and compared their performance against, and amongst, PyTorch's built-in RNN modules.
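A from-scratch character-level RNN boils down to one recurrent update per character. The step below is a generic vanilla-RNN sketch in NumPy (the project used PyTorch); all names and shapes are illustrative, not the project's actual code:

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, bh, Why, by):
    """One step of a vanilla character-level RNN.

    x: one-hot encoding of the current character, h_prev: previous
    hidden state. Returns the new hidden state and a probability
    distribution over the next character. Names/shapes are illustrative.
    """
    h = np.tanh(Wxh @ x + Whh @ h_prev + bh)  # recurrent hidden-state update
    logits = Why @ h + by                     # unnormalized next-char scores
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                      # softmax over the vocabulary
    return h, probs
```

Sampling a name or a paragraph then amounts to feeding each sampled character back in as the next input until an end-of-sequence character is drawn.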
Variational Deep Learning
Studied and implemented various autoencoders and generative networks.
Developing variational models for multimodal applications, mainly sequential multimodal data like electroencephalography signals.
Landmark Retrieval
Aimed at extracting images of landmarks similar to a query image.
Designed a ResNet-101-based autoencoder for this task on the Google Landmarks Dataset v2 using TensorFlow.
Real-time Digit Classifier
Developed an open-source pipeline for human-computer interaction using Deep Learning and Computer Vision for digit classification.
Trained Convolutional and Deep Neural Networks from scratch.
Achieved 99% accuracy on the MNIST Dataset in real-time.
Detection & Tracking
Aimed at object detection and tracking from high altitude aerial vehicles.
Optimized the pipeline to deliver real-time performance with human-level accuracy.