Projects

Some projects I've worked on

Diagnolytics: Predictive and trend analytics for preventive healthcare

Feb '18

Aishwarya T, Dhanya Bhat, Srideepika Jayaraman, Twinkle Tanna

Link!

SheHacks Boston was a 36-hour all-female hackathon.

We took part in the Hacking Healthcare with AI track, sponsored by IBM, and won 1st place!

Diagnolytics is designed to serve as a platform where physicians can view detailed analytic reports on their patients and surroundings, helping them predict and prevent diseases before they develop. With analytics on the distribution of patients, their conditions, and their geography, predictions are tailored to each patient, allowing the physician to make a personalized diagnosis. Each patient record contains a history of their illnesses, insurance claims trends (inpatient, outpatient, and carrier), and the following predictions:

  • the amount of insurance they may claim in the subsequent year
  • the diseases they may contract in the future
  • the similarity of their condition to that of other patients

These predictions were made with data from the Centers for Medicare and Medicaid Services (CMS.gov), comprising insurance claims from Medicare beneficiaries. We applied linear regression to predict insurance claims, helping retirees plan their finances better, and used K-Nearest Neighbors to identify patients with similar diseases and medications as an indicator of severity and likelihood. We also used Tableau for visualization and presentation.
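The two models above can be sketched in a few lines with scikit-learn. This is a minimal illustration, not the original pipeline: the feature set (age, prior-year claims, chronic-condition count) and the toy data are hypothetical stand-ins for the CMS claims features.

```python
# Sketch of the two models described above (linear regression for
# next-year claims, k-NN for similar patients) on toy, made-up features.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)

# Hypothetical features per patient: [age, prior-year claims, chronic conditions]
X = rng.uniform([65, 0, 0], [90, 20000, 5], size=(200, 3))
# Toy target: next-year claims loosely follow prior claims + condition burden
y = 0.8 * X[:, 1] + 1500 * X[:, 2] + rng.normal(0, 500, 200)

# Predict the amount a patient may claim next year
reg = LinearRegression().fit(X, y)
predicted_claim = reg.predict([[72, 8000.0, 2]])[0]

# Find the 5 most similar patients by feature profile
knn = NearestNeighbors(n_neighbors=5).fit(X)
distances, similar_idx = knn.kneighbors([[72, 8000.0, 2]])
```

In the real system the regression output fed the claims forecast shown to the physician, and the neighbor set backed the "similar patients" view.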

Photo-realistic Image Generation with Generative Adversarial Networks (GANs)

Jan '18 - May '18

Srideepika Jayaraman, Twinkle Tanna

Poster Link!

As part of the course COMPSCI 682 Neural Networks: A Modern Introduction, my teammate and I developed this project under Prof. Erik G. Learned-Miller.

Generating photo-realistic images from text captions is a challenging problem and could be applied in many fields, such as identity generation from face sketches, Computer-Aided Design, and interior design. Adding features like facial hair, wrinkles, hair type, or teeth to an approximation of a criminal's face as described by a witness is another potential application. Text such as 'The bird is dark grey brown with a thick curved bill and a flat shaped tail' is visually descriptive, but the problem is still hard because many arrangements of colors and shapes fit the description.

We used a Generative Adversarial Network, in which the generator strives to fool the discriminator, which in turn is trained adversarially to tell synthesized images from real ones. The work is inspired by StackGAN++. Stage 1 comprises a generator-discriminator pair that generates low-resolution (64×64) images from text captions. Stage 2 captures more detail and enhances the resolution to 256×256. Our enhancements were:

  • Using InferSent instead of SkipThought sentence vectors, as they perform better on caption-retrieval tasks.
  • Adding an additional stage to produce 512×512-resolution images.
  • Training stages in parallel by feeding half the sentence to each.
  • Future work involves adding attention over parts of the text for finer detail.
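The adversarial objective itself can be shown in miniature. The sketch below is a toy 1-D GAN in plain NumPy: a linear generator learns to match samples from N(3, 1) against a logistic-regression discriminator, with hand-derived gradients. It illustrates only the generator-vs-discriminator game, nothing like the stacked, text-conditional convolutional model we actually built.

```python
# Toy 1-D GAN: generator x = a*z + b vs. logistic discriminator sigmoid(w*x + c).
# Both are updated by gradient ascent on their own log-likelihood objectives.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

a, b = 1.0, 0.0   # generator parameters (scale, shift of noise z)
w, c = 0.1, 0.0   # discriminator parameters
lr = 0.01

for step in range(2000):
    real = rng.normal(3.0, 1.0, 32)          # samples from the "real" data
    z = rng.normal(0.0, 1.0, 32)
    fake = a * z + b                          # synthesized samples

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * np.mean((1 - d_real) * real - d_fake * fake)
    c += lr * np.mean((1 - d_real) - d_fake)

    # Generator step: push D(fake) toward 1 (i.e., fool the discriminator)
    d_fake = sigmoid(w * fake + c)
    grad_x = (1 - d_fake) * w                 # d/dx of log D(fake)
    a += lr * np.mean(grad_x * z)
    b += lr * np.mean(grad_x)

samples = a * rng.normal(0.0, 1.0, 1000) + b  # generated distribution
```

At equilibrium the generator's output distribution matches the real one well enough that the discriminator can no longer separate them; StackGAN-style models play the same game with convolutional networks conditioned on the caption embedding, once per stage.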

Kwery Korrection

Sept '18 - Dec '18

Sanjay Reddy, Twinkle Tanna

Report Link!

Github

As part of the course COMPSCI 646 Information Retrieval, my teammate and I developed this project under Prof. James Allan and Hamed Zamani.

Searching the web is a lot easier now with features like auto-correct and 'did you mean'. Getting the right query and word sense across matters for avoiding ambiguity and enabling personalization. We aimed to correct queries by fixing their spelling and the spacing between words. The project compares four traditional approaches (Peter Norvig's approach, SymSpell, autocorrect, and a dictionary-based approach) against four deep-learning approaches (word vectors, and word- and character-level machine-translation approaches) to query correction. We built an end-to-end system with 250 queries, which were first injected with noise and then passed through each of the eight spell-correction techniques. SymSpell and the word-vector approach performed best among the traditional and deep-learning approaches respectively. MAP and nDCG were the primary evaluation metrics.
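Of the traditional baselines, Norvig's approach is the most compact to illustrate: generate every string within edit distance 1 of the misspelled word and keep the candidate with the highest corpus frequency. The sketch below condenses that idea; the tiny word-frequency table is a stand-in for the real corpus statistics we used.

```python
# Condensed sketch of Peter Norvig's spell-correction idea:
# candidates = all edit-distance-1 variants; score = corpus frequency.
from collections import Counter

# Toy frequency table (the project used real query/document text instead)
WORDS = Counter("the quick brown fox jumps over the lazy dog the the".split())

def edits1(word):
    """All strings one delete, transpose, replace, or insert away."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word):
    # Prefer the word itself if known, else known edit-distance-1 candidates
    candidates = ({word} & WORDS.keys()) or (edits1(word) & WORDS.keys()) or {word}
    return max(candidates, key=lambda w: WORDS[w])

print(correct("teh"))  # -> "the" (recovered via a transposition)
```

SymSpell wins on speed by precomputing delete-only variants of the dictionary, so lookup avoids generating the full 26-letter replace/insert sets at query time.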

Candle

April '18

Daniel Sam Pete Thiyagu, Sanjay Reddy, Srideepika J, Sunny Katkuri, Twinkle Tanna

Link!

Youtube Submission Link!

PerkinsHacks, organized by the Perkins School for the Blind, looked to address everyday issues faced by the blind community.

We aimed to solve a real-world problem: filling out forms. A visually challenged person often needs to involve another person and may have to divulge personal information to them. Involving Alexa let us impart the feeling of talking to a person while keeping the information confidential. We targeted PDF and HTML forms available online; Alexa, via the Amazon Lex API, then helps the person fill out the form interactively. We also integrated an OCR component that helps read menus at a restaurant.

Novella the Tale Weaver

Nov '17

Alex Lamson, Chris Raff, Srideepika J, Twinkle Tanna

Link!

HackUMass is a 36-hour hackathon organized by the University of Massachusetts Amherst.

We created a chatbot with a story-building personality themed on Harry Potter, using the Amazon Lex API to communicate with the user. The exchange starts with the user typing or speaking a sentence; Novella replies with a sentence of its own, interactively building a story with the user. We used Markov chains to predict the next sentence.
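A word-level Markov chain of the kind we used can be built from bigram counts alone: record which words follow each word in the corpus, then walk the table randomly. The sketch below uses a one-line toy corpus in place of the Harry Potter text Novella was actually trained on.

```python
# Minimal word-level Markov chain: bigram transition table + random walk.
import random
from collections import defaultdict

corpus = "the wand chose the wizard and the wizard raised the wand".split()

# word -> list of words observed to follow it (duplicates keep the counts)
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def generate(start, length, seed=0):
    """Walk the chain from `start`, sampling each next word."""
    random.seed(seed)
    words = [start]
    for _ in range(length - 1):
        options = transitions.get(words[-1])
        if not options:  # dead end: no observed successor
            break
        words.append(random.choice(options))
    return " ".join(words)

print(generate("the", 6))
```

Because successors are stored with repetition, `random.choice` naturally samples them in proportion to how often each bigram occurred, which is all a first-order Markov model needs.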

Wordify : A Reverse Dictionary for Everyone

Sept '17 - Dec '17

Jay Shah, Twinkle Tanna

Poster Link!

As part of the course 585: Introduction to Natural Language Processing, my teammate and I developed a reverse dictionary under Prof. Brendan T. O'Connor.

Wordify aims to produce the word that was right at the tip of your tongue but that you just can't seem to remember. We used word embeddings to give a one-word summary of a phrase. The system was trained on Wikipedia definitions, and we used spaCy word embeddings on the WordNet dataset. We computed a baseline accuracy with vector-space operations and later trained an LSTM neural network.
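The vector-space baseline amounts to: embed the definition by averaging its word vectors, then return the vocabulary word whose embedding is closest by cosine similarity. The sketch below uses tiny hand-made 3-D vectors as stand-ins for the spaCy embeddings the project actually used.

```python
# Reverse-dictionary baseline: mean of definition word vectors,
# then nearest vocabulary word by cosine similarity.
import numpy as np

# Toy embeddings; real spaCy vectors are 300-dimensional.
vocab = {
    "monarch": np.array([0.9, 0.8, 0.1]),
    "royal":   np.array([0.8, 0.9, 0.0]),
    "ruler":   np.array([0.9, 0.7, 0.2]),
    "banana":  np.array([0.0, 0.1, 0.9]),
    "fruit":   np.array([0.1, 0.0, 0.8]),
}

def reverse_lookup(definition):
    words = [w for w in definition.lower().split() if w in vocab]
    query = np.mean([vocab[w] for w in words], axis=0)  # average the vectors
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    # Exclude the definition's own words so they can't answer themselves
    return max((w for w in vocab if w not in words),
               key=lambda w: cos(vocab[w], query))

print(reverse_lookup("royal ruler"))   # -> "monarch"
print(reverse_lookup("a sweet fruit")) # -> "banana"
```

The LSTM replaced the unweighted average with a learned composition of the definition, which handles word order and function words that a plain mean ignores.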