A good decision is based on knowledge and not on numbers
-Plato
Introduction
I am Vikas, Senior Research Scientist at Samsung Research America, Mountain View. My research is broadly focused on NLP, NLG, multi-modal NLP and generative AI. Currently my work is focused on specializing large language models (LLMs) for a variety of applied NLP and multi-modal NLP tasks.
I completed my PhD in Information and Computer Sciences under two awesome advisors, Steven Bethard and Mihai Surdeanu. I completed my Bachelor's in ECE at Visvesvaraya National Institute of Technology, Nagpur, in 2016.
Please check out the links below for my CV, Google Scholar page, and other profiles, which list my up-to-date publications and professional experience:
CV, Google Scholar, Semantic Scholar, LinkedIn, Twitter, GitHub
Research Interests:
Proactive conversation and dialogue systems using LLMs for improving user experience and engagement
Retrieval-Augmented Generation (RAG) on complex multi-modal documents with LLMs
Complex and explainable multi-hop Question Answering (QA)
Secondary collaborative work on 1) multi-lingual LLMs and 2) AI security with LLMs
Recent News
Short conference paper accepted at SIGIR 2021 - Topic - Document retrieval and reading comprehension for low-resource domains
Long conference paper accepted at NAACL 2021 - Topic - Explainable Multi-hop QA - If You Want to Go Far Go Together: Unsupervised Joint Candidate Evidence Retrieval for Multi-hop Question Answering.
Work Experience
Robert Bosch (Research and Technology Center) Sunnyvale, CA (May 2019 - Aug 2019)
Position: Conversational AI Research Intern, Tasks - Worked on natural language understanding of user queries in conversational AI products, including entity recognition and disambiguation/normalization.
NVIDIA, Santa Clara, CA (May 2018 - Aug 2018)
Position: Deep Learning and NLP Applied Scientist Intern, Tasks - Worked on information extraction from user reviews and feedback on NVIDIA products.
The University of Arizona, Tucson, AZ (Aug 2016 - Present)
Position: Research Assistant for DARPA WM projects, Tasks - Named Entity Recognition, Disambiguation, relation extraction, Question Answering, explainability in AI.
A brief description of my papers:
SIGIR 2020: Having Your Cake and Eating it Too: Training Neural Retrieval for Language Inference without Losing Lexical Match (Short conference paper), Acceptance rate = 30%
Vikas Yadav, Steven Bethard, Mihai Surdeanu
Preprint and description coming up soon...
ACL 2020: Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering (Long conference paper), Acceptance rate = 25%
Vikas Yadav, Steven Bethard, Mihai Surdeanu
Preprint and description coming up soon...
EMNLP 2019: Quick and (not so) Dirty: Unsupervised Selection of Justification Sentences for Multi-hop Question Answering, (Long conference paper), Codes, Poster, Acceptance rate = 22%
Vikas Yadav, Steven Bethard, Mihai Surdeanu
We propose an unsupervised strategy for the selection of justification sentences for multi-hop question answering (QA) that (a) maximizes the relevance of the selected sentences, (b) minimizes the overlap between the selected facts, and (c) maximizes the coverage of both question and answer. This unsupervised sentence selection method can be coupled with any supervised QA approach. We show that the sentences selected by our method improve the performance of a state-of-the-art supervised QA model on two multi-hop QA datasets: AI2's Reasoning Challenge (ARC) and Multi-Sentence Reading Comprehension (MultiRC). We obtain new state-of-the-art performance on both datasets among approaches that do not use external resources for training the QA system: 56.82% F1 on ARC (41.24% on Challenge and 64.49% on Easy) and 26.1% EM0 on MultiRC. Our justification sentences have higher quality than the justifications selected by a strong information retrieval baseline, e.g., by 5.4% F1 in MultiRC. We also show that our unsupervised selection of justification sentences is more stable across domains than a state-of-the-art supervised sentence selection method.
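As a rough illustration of the three criteria above, here is a minimal sketch that scores every set of k candidate sentences by relevance, overlap, and coverage and keeps the best set. The token-overlap measures, the additive combination `relevance - overlap + coverage`, and the names `score_set` and `select_justifications` are assumptions made for this toy example; they are not the scoring function from the paper.

```python
from itertools import combinations


def tokens(text):
    """Lowercased whitespace tokens as a set (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def score_set(sentences, question, answer):
    """Hypothetical score: reward relevance and coverage, penalize overlap."""
    qa = tokens(question) | tokens(answer)
    sent_tokens = [tokens(s) for s in sentences]
    if not qa or not sent_tokens:
        return 0.0
    # (a) relevance: average share of question/answer terms found in each sentence
    relevance = sum(len(t & qa) / len(qa) for t in sent_tokens) / len(sent_tokens)
    # (b) overlap: average Jaccard similarity between pairs of selected sentences
    pairs = [(a, b) for a, b in combinations(sent_tokens, 2) if a | b]
    overlap = sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs) if pairs else 0.0
    # (c) coverage: share of question/answer terms covered by the whole set
    coverage = len(set().union(*sent_tokens) & qa) / len(qa)
    return relevance - overlap + coverage


def select_justifications(candidates, question, answer, k=2):
    """Return the k-sentence subset with the highest score (exhaustive search)."""
    return max(combinations(candidates, k), key=lambda s: score_set(s, question, answer))


candidates = [
    "plants make food through photosynthesis",
    "plants need sunlight",
    "plants make food",
    "rocks are hard",
]
selected = select_justifications(candidates, "what do plants need to make food", "sunlight")
print(selected)
```

On this toy input the near-duplicate pair scores poorly because of the overlap penalty, and the irrelevant sentence adds no coverage, so the selected pair combines two complementary facts. Exhaustive search is only practical for small candidate pools.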
NAACL 2019: Alignment over Heterogeneous Embeddings for Question Answering, (Long conference paper), Codes, Oral Presentation, Acceptance rate = 23%
Vikas Yadav, Steven Bethard, Mihai Surdeanu
We propose a simple, fast, and mostly-unsupervised approach for non-factoid question answering (QA) called Alignment over Heterogeneous Embeddings (AHE). AHE simply aligns each word in the question and candidate answer with the most similar word in the retrieved supporting paragraph, and weighs each alignment score with the inverse document frequency of the corresponding question/answer term. AHE’s similarity function operates over embeddings that model the underlying text at different levels of abstraction: character (FLAIR), word (BERT and GloVe), and sentence (InferSent), where the latter is the only supervised component in the proposed approach. Despite its simplicity and lack of supervision, AHE obtains a new state-of-the-art performance on the “Easy” partition of the AI2 Reasoning Challenge (ARC) dataset (64.6% accuracy), top-two performance on the “Challenge” partition of ARC (34.1%), and top-three performance on the WikiQA dataset (74.08% MRR), outperforming many other complex, supervised approaches. Our error analysis indicates that alignments over character, word, and sentence embeddings capture substantially different semantic information. We exploit this with a simple meta-classifier that learns how much to trust the predictions over each representation, which further improves the performance of unsupervised AHE.
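The alignment step described above can be sketched in a few lines: each question/answer term is matched to its most similar paragraph term, and the match score is weighted by the term's IDF. This is a minimal sketch under stated assumptions: it uses a single toy embedding table and cosine similarity, whereas AHE combines several embedding types; the names `alignment_score`, `toy_embeddings`, and `toy_idf` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_score(query_terms, paragraph_terms, embeddings, idf):
    """IDF-weighted sum of each query term's best cosine match in the paragraph."""
    score = 0.0
    for q in query_terms:
        if q not in embeddings:
            continue  # skip out-of-vocabulary terms in this toy version
        best = max(
            (cosine(embeddings[q], embeddings[p]) for p in paragraph_terms if p in embeddings),
            default=0.0,
        )
        score += idf.get(q, 1.0) * best  # unseen terms get a neutral weight of 1.0
    return score


# Toy 2-d embeddings and IDF weights, invented for illustration only.
toy_embeddings = {
    "plant": [1.0, 0.0],
    "sunlight": [0.0, 1.0],
    "tree": [0.9, 0.1],
    "light": [0.1, 0.9],
    "rock": [-1.0, 0.0],
}
toy_idf = {"plant": 1.0, "sunlight": 2.0}

query = ["plant", "sunlight"]  # question terms plus candidate-answer terms
good = alignment_score(query, ["tree", "light"], toy_embeddings, toy_idf)
bad = alignment_score(query, ["rock"], toy_embeddings, toy_idf)
print(good > bad)
```

A paragraph whose words lie close to the query terms in embedding space scores higher than an unrelated one, and rarer query terms (higher IDF) contribute more to the total.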
SIGIR 2018: Sanity check: A strong alignment and information retrieval baseline for question answering, (Short conference paper), Codes, Poster, Acceptance rate = 29%
Vikas Yadav, Rebecca Sharp, Mihai Surdeanu
While increasingly complex approaches to question answering (QA) have been proposed, the true gain of these systems, particularly with respect to their expensive training requirements, can be inflated when they are not compared to adequate baselines. Here we propose an unsupervised, simple, and fast alignment and information retrieval baseline that incorporates two novel contributions: a one-to-many alignment between query and document terms and negative alignment as a proxy for discriminative information. Our approach not only outperforms all conventional baselines as well as many supervised recurrent neural networks, but also approaches the state of the art for supervised systems on three QA datasets. With only three hyperparameters, we achieve 47% on an 8th grade Science QA dataset, 32.9% on a Yahoo! Answers QA dataset, and 64% MAP on WikiQA.
COLING 2018: A survey on recent advances in named entity recognition from deep learning models, (Long conference paper), Codes, Poster, Acceptance rate = 35%
Vikas Yadav, Steven Bethard
Named Entity Recognition (NER) is a key component in NLP systems for question answering, information retrieval, relation extraction, etc. NER systems have been studied and developed widely for decades, but accurate systems using deep neural networks (NN) have only been introduced in the last few years. We present a comprehensive survey of deep neural network architectures for NER, and contrast them with previous approaches to NER based on feature engineering and other supervised or semi-supervised learning algorithms. Our results highlight the improvements achieved by neural networks, and show how incorporating some of the lessons learned from past work on feature-based NER systems can yield further improvements.
NLP Publications Google Scholar
Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering, ACL 2020, Long conference paper
Quick and (not so) Dirty: Unsupervised Selection of Justification Sentences for Multi-hop Question Answering, EMNLP-IJCNLP 2019, Long conference paper, PDF
Alignment over Heterogeneous Embeddings for Question Answering, NAACL 2019, Long conference paper, PDF
Sanity Check: A Strong Alignment and Information Retrieval Baseline for Question Answering, SIGIR 2018, Short conference paper, PDF
A Survey on Recent Advances in Named Entity Recognition from Deep Learning models, COLING 2018, Long conference paper, PDF
Deep Affix Features Improve Neural Named Entity Recognizers, *SEM SIGNLP 2018, Short conference paper, PDF
UArizonaIschool System at MADE1.0 NLP Challenge, CEUR, MADE 1.0 workshop, PDF