Experience

Work Experience

Data Scientist/Engineer at IBM: Chief Analytics Office

August '19 - Present

The Chief Analytics Office (CAO) is an internal-facing team that solves enterprise-level business problems with data science techniques, working together with Finance and Strategy.

Developing foundational assets ranging from entity matching to article scoring techniques.

Graduate Student Researcher at IESL, UMass Amherst

Sept '18 - April '19

IESL: Information Extraction and Synthesis Laboratory

Performed importance sampling for on-demand crowdsourcing and evaluated relation extraction on PubMed abstracts using the Universal Schema model.

Grader at UMass Amherst

Sept '18 - Present

Grading assignments and worksheets for the graduate course 585: Introduction to Natural Language Processing.

Research Intern at IBM Research: Thomas J. Watson Research Center

AI Solutions & Blockchain

June '18 - Aug '18

Dhanya Bhat, Aishwarya T, Srideepika Jayaraman, Ira Ceka, Twinkle Tanna

Enhancing Neural Relation Extraction Using Entity Features

I worked under the guidance of Anuradha Bhamidipaty, John Vergo, and Justin Platz in the AI Solutions and Blockchain group. My work applied NLP to unstructured data in the mergers and acquisitions domain in order to build a knowledge base. We worked with pre-annotation tools such as Watson NLU and Watson Knowledge Discovery, and used Figure8 to crowdsource data. We aimed to build a relation extraction framework using piecewise CNNs, and improved its performance by adding features and changing the architecture in three different ways with the help of entity recognition. We also aimed to resolve coreferences to improve performance further.
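To illustrate the "piecewise" part of a piecewise CNN, here is a minimal sketch of piecewise max pooling: the outputs of one convolution filter are split into three segments by the two entity positions, and each segment is pooled separately. This is an illustration of the general technique, not the project's code; the function name `piecewise_max_pool` and the choice to pool an empty segment to 0.0 are assumptions.

```python
from typing import List


def piecewise_max_pool(features: List[float], e1: int, e2: int) -> List[float]:
    """Max-pool one filter's outputs over three segments.

    The sentence is split by the two entity positions e1 < e2 into
    [0..e1], (e1..e2] and (e2..end], and each segment is pooled on its own.
    """
    if not (0 <= e1 < e2 < len(features)):
        raise ValueError("expected 0 <= e1 < e2 < len(features)")
    segments = [
        features[: e1 + 1],        # up to and including the first entity
        features[e1 + 1 : e2 + 1], # between the entities, including the second
        features[e2 + 1 :],        # after the second entity
    ]
    # Assumption: an empty trailing segment (entity at the last token) pools to 0.0.
    return [max(seg) if seg else 0.0 for seg in segments]


pooled = piecewise_max_pool([0.1, 0.9, 0.3, 0.7, 0.2, 0.5], e1=1, e2=3)
```

Compared with a single max over the whole sentence, this keeps one value per segment, so the classifier retains coarse information about where a feature fired relative to the two entities.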

Graduate Researcher at Oracle Labs

Jan '18 - May '18

Ashish Ranjan, Aaron Traylor, Neha Choudhary, Twinkle Tanna

As part of the Industry Mentorship Program under Prof. Andrew McCallum, I worked with Oracle Labs under the guidance of Michael Wick and Jesse Lingeman on a classification project involving NLP and biological data. The data comprised the Gene Ontology (an ontology of genes and their relationships), the annotations (details) of each gene, and the abstracts of the relevant journal papers. A field in the annotation called 'Evidence Code' helps biologists analyse the function of the gene and understand the reasoning or proof by which that inference was derived from the journal paper. Knowing how conclusions are derived is essential in this domain, but finding this evidence manually is tedious given the volume of data.

Our aim was to classify each annotation into one of these evidence codes, given the other annotation details and the abstract of the journal paper, so as to automate the manual process of finding proof. We performed feature extraction on details derived from the ontology, such as parent vectors. NLP techniques such as TF-IDF, LSTMs, and Hierarchical Attention models were used to extract information from the journal papers. We built SVM and neural network multilabel classifiers to predict the evidence code field of the annotation.
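As a rough illustration of the TF-IDF step, the sketch below weights each term in a tokenized abstract by its frequency in that abstract and its rarity across all abstracts. It uses the plain `tf * log(N / df)` variant with no smoothing, which is an assumption; the project's actual weighting scheme and tooling are not stated here, and the function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy abstracts, made up for illustration only.
docs = [["gene", "binds", "protein"], ["gene", "regulates", "growth"]]
vectors = tfidf(docs)
```

Terms that occur in every abstract (here "gene") get a weight of zero, while terms specific to one abstract get a positive weight; vectors like these could then be fed to a classifier such as an SVM.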

Software Engineering Intern at Cisco

Jan '17 - May '17

Bikkumala Karthik, Twinkle Tanna

Under the guidance of Ather Sayeed Kanak and the mentorship of Srinivas Gadamsetty, my teammate and I designed and developed a health monitoring system for Wireless LAN Controllers. Log data from a Wireless LAN Controller can be analysed to create visualisations that make faults in the system easy to identify. System logs were parsed with Logstash and stored in Elasticsearch, and a monitoring dashboard built in Kibana rendered interactive visualisations that highlight significant fluctuations. We also integrated Elasticsearch with Apache Hive to support relational querying.

Software Engineering Intern at Cisco

May '16 - July '16

The Wide Area Network Services team worked on a product, the WAAS Box, which optimized the network traffic flowing through the various layers. Some optimizations were more effective than others, and the choice could be tuned based on the kind of traffic flowing at a given time of day. Under the guidance of Punarvasu Srigiriraju and the mentorship of Vibhuti Shali, I developed a tool to analyze and predict network traffic in order to optimize it dynamically. I performed statistical forecasting using ARIMA and benchmarked it against linear regression and neural networks to find an optimal prediction model. The output was showcased in a GUI built with AngularJS and Bootstrap.
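The forecasting idea can be sketched with the simplest member of the ARIMA family, an AR(1) model fitted by least squares: each value is predicted as a constant plus a multiple of the previous value. This is a simplified stand-in, not the model used in the tool; the function names and the toy series are hypothetical.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        raise ValueError("series is constant; phi is not identifiable")
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series: List[float], steps: int) -> List[float]:
    """Roll the fitted AR(1) model forward for the given number of steps."""
    c, phi = fit_ar1(series)
    preds, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        preds.append(last)
    return preds


# Toy traffic series, made up for illustration only.
traffic = [0.0, 2.0, 3.0, 3.5, 3.75, 3.875]
next_values = forecast(traffic, steps=2)
```

A benchmark against another model, such as linear regression on the time index, would compare forecast errors on held-out observations.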