Publications (in ICML, ICLR, AAAI, NAACL, EMNLP, COLING, ICASSP, etc.)

Federated Continual Learning for Text Classification via Selective Inter-client Transfer

In EMNLP 2022 (findings) | PDF

Acceptance rate: -
Author(s): Yatin Chaudhary, Pranav Rai, Matthias Schubert, Hinrich Schütze and Pankaj Gupta
Keywords: Federated Learning, Continual Learning, Federated Continual Learning, Federated Natural Language Processing, Edge AI, etc.
TL;DR: Federated continual learning for natural language processing, applied to text classification, with knowledge transfer across clients without sharing data

Multi-source Neural Topic Modeling in Multi-view Embedding Spaces

In conference proceedings of NAACL 2021. | pdf

Acceptance rate: 26% (477/1797)
Author(s): Pankaj Gupta*, Yatin Chaudhary*, Hinrich Schütze
Keywords: Neural topic model, natural language processing, transfer learning, word embeddings, latent topics, unsupervised deep learning, etc.
TL;DR: Jointly leverage word and latent topic embeddings from one or many sources to address data sparsity and polysemy issues in topic models

TopicBERT for Energy Efficient Document Classification

In EMNLP 2020 (findings) | PDF 

Acceptance rate: 22.4% for EMNLP and a further 15.5% for Findings
Author(s): Yatin Chaudhary*, Pankaj Gupta*, Khushbu Saxena, Vivek Kulkarni, Thomas Runkler and Hinrich Schütze
Keywords: Natural Language Processing, Efficient BERT, Efficient Text classification, Explainable AI in BERT-based models
TL;DR: Production-friendly and environment-friendly BERT models: TopicBERT and TopicDistilBERT. Optimizes computational cost for document classification and reduces the number of self-attention operations. Achieves a 1.4x (40%) speedup with a 40% reduction in CO2 emissions while retaining 99.9% performance.

Explainable and Discourse Topic-aware Neural Language Understanding 

In conference proceedings of ICML 2020 | PDF

Acceptance rate: 21.8% (1088/4990)
Author(s): Yatin Chaudhary, Hinrich Schütze, Pankaj Gupta
Keywords: Natural Language Processing, Topic and discourse aware Representation learning for long documents
TL;DR: A framework to improve document representation with topics at both document and sentence level, with complementary learning from explainable topic terms

Neural Topic Modeling with Continual Lifelong Learning

In conference proceedings of ICML 2020 | PDF

Acceptance rate: 21.8% (1088/4990)
Author(s): Pankaj Gupta, Yatin Chaudhary, Thomas Runkler, Hinrich Schütze
Keywords: Lifelong Continual Machine Learning, Neural topic model, Natural Language Processing, Transfer Learning, Document representation learning, etc.
TL;DR: A framework to perform improved topic modeling within the lifelong learning paradigm

Doctoral (PhD) Thesis: Neural Information Extraction from Natural Language Text 

Doctoral PhD Thesis. 2019. | pdf

Grade: 1.0 (Magna cum laude)
Author(s): Pankaj Gupta | Advisor: Prof. Dr. Hinrich Schütze | Examiner(s): Dr. Ivan Titov and Dr. William Yang Wang
Keywords: Information Extraction, Relation Extraction, Named Entity Extraction, Neural Topic Modeling, Document Representation, Neural Networks, Textual Similarity, Interpretability of Neural Networks, Natural language processing, Transfer learning, Unsupervised deep learning, Generative modeling, Neural Composite modeling, etc.
TL;DR: Neural information extraction within and across sentence boundaries using supervised and semi-supervised paradigms; neural topic modeling with transfer learning addressing data sparsity.

Neural Architectures for Fine-Grained Propaganda Detection  in News

In EMNLP-IJCNLP 2019 workshop on NLP4IF, Hong Kong  | pdf

Author(s): Pankaj Gupta, Khushbu Saxena, Usama Yaseen, Thomas Runkler, Hinrich Schütze
Keywords: Shared task on fine-grained analysis of propaganda in news articles, Linguistic, layout and topical features, BERT and multi-granularity LSTM-CRF models to jointly perform sentence and fragment level propaganda detection
TL;DR: Rank: 3rd (out of 12 participants) in fragment level and 4th (out of 25 participants) in sentence level propaganda detection tasks.

BioNLP-OST 2019 RDoC Tasks: Multi-grain Neural Relevance Ranking Using Topics and Attention Based Query-Document-Sentence Interactions 

In EMNLP-IJCNLP 2019 workshop on BioNLP-OST , Hong Kong | pdf

Author(s): *Yatin Chaudhary, *Pankaj Gupta, Hinrich Schütze (* denotes equal contribution)
Keywords: Neural Relevance Ranking, Document and Sentence Ranking, PubMed corpus, BioNLP shared Tasks, Winning solutions
TL;DR: Two winning solutions for document and sentence relevance ranking tasks in biomedical data sets using topic modeling and re-ranking

Linguistically Informed Relation Extraction and Neural Architectures for Nested Named Entity Recognition in BioNLP-OST 2019 

In EMNLP-IJCNLP 2019 workshop on BioNLP-OST, Hong Kong | pdf 

Author(s): *Usama Yaseen, *Pankaj Gupta, Hinrich Schütze (* denotes equal contribution)
Keywords: Named Entity Recognition and Normalization, Relation extraction, Neural Architectures, BioNLP shared Tasks, Winning solutions
TL;DR: Two winning solutions for (nested) NER and relation extraction in biomedical English and Spanish texts

textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior

In conference proceedings of ICLR 2019, New Orleans USA  |  pdf code

Acceptance rate: 31%
Author(s): Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze
Keywords: Neural topic model, natural language processing, text representation, language modeling, information retrieval, deep learning
TL;DR: Unified neural model of topic and language modeling to introduce language structure in topic models for contextualized topic vectors

Document Informed Neural Autoregressive Topic Models with Distributional Prior

In conference proceedings of AAAI 2019, Honolulu Hawaii USA  |  pdf code talk poster

Acceptance rate: 16%
Author(s): Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze
Keywords: Neural topic model with word embeddings, natural language processing, text representation, language modeling, information retrieval, deep learning
TL;DR: Informed topic model with word embeddings

Neural Relation Extraction Within and Across Sentence Boundaries

In conference proceedings of AAAI 2019, Honolulu Hawaii USA  |  pdf code talk poster

Acceptance rate: 16%
Author(s): Pankaj Gupta, Subburam Rajaram, Thomas Runkler, Hinrich Schütze
Keywords: Relation Extraction, Dependency-based Neural Networks, Recurrent Neural Networks, Recursive composition, tree-RNN, Biomedical shared task
TL;DR: Novel neural network architecture using dependency features for relationships spanning sentence boundaries

LISA: Explaining Recurrent Neural Network Judgments via Layer-wIse Semantic Accumulation and Example to Pattern Transformation

In EMNLP2018 workshop on BLACKBOXNLP2018, Brussels Belgium  | pdf poster

Author(s): Pankaj Gupta, Hinrich Schütze
Keywords: Explainable AI, Interpretability of RNNs, Explaining RNN predictions, Blackbox Neural Networks
TL;DR: Explaining predictions in RNNs, identifying key features in relation classification, layer-wise semantic accumulation in RNNs

Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retrieval in Asymmetric Texts

In COLING2018 workshop on SEMANTIC DEEP LEARNING (SEMDEEP-3), Santa Fe, New Mexico, USA  | pdf talk  

Author(s): Pankaj Gupta, Bernt Andrassy, Hinrich Schütze
Keywords: Industrial application of similarity learning in a ticketing system
TL;DR: Similarity learning in asymmetric texts, applied to retrieval tasks in an industrial ticketing system

Joint Bootstrapping Machines for High Confidence Relation Extraction

In conference proceedings of NAACL-HLT 2018, New Orleans USA  |  pdf code oral slides

Acceptance rate: 32%
Author(s): Pankaj Gupta, Benjamin Roth, Hinrich Schütze
Keywords: Semi-supervised Bootstrapping, Relation Extraction, seeding mechanisms, pattern confidence
TL;DR: Novel mechanism to bootstrap relation extractors using seed entities and patterns with word embeddings; better estimation of pattern confidence

Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time

In conference proceedings of NAACL-HLT 2018, New Orleans USA  |  pdf code oral slides

Acceptance rate: 32%
Author(s): Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Bernt Andrassy
Keywords: Probabilistic Graphical models, Neural Topic Models, Topic over Time, Topic Modeling
TL;DR: Novel neural architecture to extract topics over time

Table Filling Multi-Task Recurrent Neural Network for Joint Entity and Relation Extraction

In conference proceedings of COLING 2016, Osaka Japan  |  pdf code

Acceptance rate: 32%
Author(s): Pankaj Gupta, Hinrich Schütze, Bernt Andrassy
Keywords: Entity-relation extraction, NLP, Unified neural architecture, multi-tasking, joint modeling
TL;DR: Joint entity and relation extraction in a table-filling scheme modeled in a unified multi-tasking neural architecture

Combining recurrent and convolutional neural networks for relation classification

In conference proceedings of NAACL 2016, San Diego USA  |  pdf

Acceptance rate: 25%
Author(s): Ngoc Thang Vu, Heike Adel, Pankaj Gupta, Hinrich Schütze
Keywords: Relation classification, CNN, RNN, Ranking loss, Combining RNN and CNN
TL;DR: Novel neural architectures based on CNN and RNN for relation classification

Bi-directional recurrent neural network with ranking loss for spoken language understanding

In conference proceedings of ICASSP 2016, Shanghai China  |  pdf

Acceptance rate: 47%
Author(s): Ngoc Thang Vu, Pankaj Gupta, Heike Adel, Hinrich Schütze
Keywords: Ranking loss, RNN, slot-filling
TL;DR: Novel neural architecture based on RNN and ranking loss

Heterogeneous ensembles for predicting survival of metastatic, castrate-resistant prostate cancer patients

In F1000RESEARCH 2016  |  pdf

Author(s): Sebastian Pölsterl, Pankaj Gupta, Lichao Wang, Sailesh Conjeti, Amin Katouzian, Nassir Navab
Keywords: Prostate Cancer analysis, Survival Analysis, SVM, Ensembling, nested cross-validation

Prediction of overall survival for patients with metastatic castration-resistant prostate cancer

In the LANCET ONCOLOGY Journal, 2016

Author(s): the Prostate Cancer Challenge DREAM Community
Keywords: Prostate Cancer analysis, Survival Analysis

Deep Learning Methods for the Extraction of Relations in Natural Language Text

Master Thesis (2015) Report submitted in DEPARTMENT OF INFORMATICS TUM, Munich Germany  |  pdf

Author(s): Pankaj Gupta
Advisor(s): Thomas Runkler, Heike Adel, Bernt Andrassy, Hans-Georg Zimmermann, Hinrich Schütze
Keywords: Relation Extraction, RNNs, Ranking loss, TAC KBP Slot Filling shared task, SemEval10

Keyword Learning for Classifying Requirements in Tender Documents

Master Guided Research (2015) Report submitted in DEPARTMENT OF INFORMATICS TUM Germany  |  pdf

Author(s): Pankaj Gupta
Advisor(s): Thomas Runkler, Bernt Andrassy
Keywords: Probabilistic Graphical models, NLP, RBMs, Discriminative RBMs, Hybrid training, pre-training and fine-tuning, Text classification, Industrial Texts

Identifying Patients with Diabetes using Discriminative Restricted Boltzmann Machines

Master Interdisciplinary Project (2015) submitted in DEPARTMENT OF INFORMATICS TUM  Germany  |  pdf

Author(s): Pankaj Gupta, Udhayaraj Sivalingam
Advisor(s): Sebastian Pölsterl, Nassir Navab
Keywords: Probabilistic Graphical models, RBMs, Discriminative RBMs, Hybrid training, pre-training and fine-tuning, Text classification, Survival Analysis, Prostate Cancer Analysis

Summarizing text by ranking text units according to shallow linguistic features

In conference proceedings of ICACT (2011), Seoul South Korea  |  pdf

Author(s): Pankaj Gupta, Vijay Shankar Pendluri, Ishant Vats
Keywords: Text Summarization, Lexical chains, Linguistic features, sentence ranking