I am an Applied Scientist at Amazon Science. My research focuses on developing machine learning (ML) models for language understanding. This includes developing algorithms that learn to read with minimal supervision and creating new applications that reduce human effort in the long run.
Before joining Amazon, I received my Ph.D. from the Department of CSE at Ohio State University, with a focus on Natural Language Processing (NLP). I have developed multiple supervised and semi-supervised machine learning systems that extract structured information from noisy user-generated text.
Contact: https://www.linkedin.com/in/jeniyat/
Professional Experience
Open-sourced Machine Learning Framework
Contributed new ML features to the open-source SageMaker Python SDK, which enables cloud deployment of ML models.
Contributed to the open-source SageMaker Example Notebooks, which host illustrated notebooks demonstrating SageMaker functionality, by authoring notebooks that walk through the workflows of several state-of-the-art machine learning models.
Research Projects
Fine-Grained Entity Extraction from Software Text
Built the first software-domain named-entity corpus, with 15k+ sentences
Proposed a transformer-based NER model with embedding-level attention
Achieved an F1 score of 78.41 [a 21.6 increase over vanilla BERT]
Related Publications: Tabassum et al. ACL '20
Code/Data: https://github.com/jeniyat/StackOverflowNER
Entity and Relation Extraction from Wet Lab Protocols
Built an entity-relation corpus of procedural text from 700+ lab recipes
Organized a shared task at EMNLP '20
Developed ensemble models for both NER and RE
Achieved an F1 score of 76.84 on the NER task
Achieved an F1 score of 81.32 on the RE task [current SOTA]
Related Publications: Tabassum et al. EMNLP '20
Code/Data: https://github.com/jeniyat/WNUT_2020_NER, https://github.com/jeniyat/WNUT_2020_RE
User Profile Mining From Twitter
Modeled the spread of information through Tweets
Analyzed tweets from 40M+ users and evaluated their profile alignment using human-vs-bot scores
Developed a Twitter API tutorial for user profile mining
Code/Data: https://github.com/jeniyat/Twitter-Profiling
Learning Semantics from Software Social Network
Extracted proximity from follower activity across 84M+ GitHub repositories
Proposed repository embeddings to measure similarity between repositories
Created repository embeddings by analyzing text content from the repository-user network
Code/Data: https://github.com/jeniyat/Github-Repository-Embedding
Time Information Resolution from Tweets
Created a temporal tagger to detect and normalize time expressions in tweets
Introduced the first distant-supervision approach to date resolution
Achieved an F1 score of 68.12 [17% increase over SUTime]
Related Publications: Tabassum et al. ACL '17, Tabassum et al. EMNLP '16, Tabassum et al. MASC-SLL '16
Code/Data: https://github.com/jeniyat/TweeTime
Social Media in Disaster Response
Explored the impact of social media during a national disaster by analyzing posts about the Savar Tragedy
Proposed a coordinated approach to relief distribution by mining repetitive posts
Related Publications: Tabassum et al. WADM '13
Web Community Extraction
Proposed a novel extraction and ranking algorithm for web communities
Demonstrated improvements in sponsored search market auctions using the proposed algorithm
Related Publications: Salekin, Tabassum and Hasan, WIMS '13
Publications
Code and Named Entity Recognition in StackOverflow
[code][pdf][slides][demo][video]
Jeniya Tabassum, Mounica Maddela, Wei Xu and Alan Ritter
Proceedings of ACL 2020
WNUT-2020 Task 1 Overview: Extracting Entities and Relations from Wet Lab Protocols
Jeniya Tabassum, Sydney Lee, Wei Xu and Alan Ritter
Proceedings of EMNLP-WNUT 2020
Time Expression Resolution for Social Media Data [pdf]
Jeniya Tabassum, Alan Ritter and Wei Xu
Proceedings of ACL-WiNLP 2017
TweeTIME: A Minimally Supervised Method for Recognizing and Normalizing Time Expressions in Twitter
Jeniya Tabassum, Alan Ritter and Wei Xu
Proceedings of EMNLP 2016
Distant Supervision for Temporal Resolution [pdf]
Jeniya Tabassum and Alan Ritter
Proceedings of MASC-SLL 2016
Role of Social Media in Disaster Response in the Context of Savar Tragedy [pdf]
Jeniya Tabassum, Himel Dev, Mohammed Eunus Ali and Md. Fahim Abdullah
Proceedings of WADM 2013
Extract and Rank Web Communities [pdf]
Asif Salekin, Jeniya Tabassum, and Masud Hasan
Proceedings of WIMS 2013
Awards/Honors
ACL D&I Scholarship for participating in ACL 2020.
WiNLP Travel Award for participating in ACL 2017.
Google Travel Award for attending the Google NLU Workshop 2017.
Student Travel Award for participating in Grad Cohort 2016.
Dean's List Award for academic excellence in the last three completed years, BUET.
University Merit Scholarship for academic excellence in all semesters, BUET.
Champion, Bangladesh National Math Olympiad, 2006.
Dhaka Education Board Scholarship for excellence in the HSC examination, '07-'11 (top 5%).
Dhaka Education Board Scholarship for excellence in the SSC examination, '04-'06 (top 5%).
Service
Program Committee/Reviewer:
AAAI 2020
ACL 2019, 2020, 2021
AIRE Journal 2019, 2020
COLING 2018
ECNLP 2022
EMNLP 2019, 2020
HCC 2019
MASC-SLL 2016
NAACL-SRW 2019, 2021, 2022
WiNLP 2020, 2022
WNUT 2016, 2017, 2018, 2019, 2020
Organizer
ACL-SRW 2018
NLP Speaker Series at OSU (2016-2018)
Panel Member
WIE, ICCIT 2016
Teaching
Senior Lecturer, CSE 5521: Artificial Intelligence II: Basic Topics (Spring 2021)
Lecturer, CSE 3521: Artificial Intelligence I: Basic Topics (Autumn 2020)
Lecturer, CSE 3521: Artificial Intelligence I: Basic Topics (Spring 2020)
TA, CSE 5522: Artificial Intelligence II: Advanced Topics (Fall 2019)
Co-author of Twitter API Tutorial
Trainer, Bangladesh Math Olympiad Camp (Summer 2007)