
Undergraduate Thesis

[Aug 2019 - Dec 2019]

  • Pursuing my bachelor's thesis at the Sirimulla Research Group, UTEP (The University of Texas at El Paso, US), under Dr. Suman Sirimulla and Dr. Vaibhav A. Dixit. Thesis topic: machine learning-based prediction of Phase I and II drug metabolism.
  • Collected, formatted, labelled and analyzed data from ChEMBL, one of the world’s largest chemical databases, using SQL and Python.
  • Currently developing cytochrome enzyme inhibition models using the KNIME platform.

Google Summer of Code '19 Intern

Mentor - Daniel Zerbino

[May 2019 - Jul 2019]

  • Applied machine learning techniques to characterize and name lncRNA genes for Ensembl, as part of Google Summer of Code 2019.
  • Ensembl aims to provide a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and other vertebrates and model organisms.
  • Used RESTful APIs and Python to collect and analyze data from two of the world’s major gene annotation databases, Ensembl and RefSeq.
  • Generated features and transformed the data in various ways to benchmark a selection of machine learning techniques.
  • Provided a platform to investigate differences between these two databases.

Machine Learning Engineer

[Jun 2018 - Dec 2018]

  • Team Pixxel is a student team from BITS Pilani working on developing a constellation of nanosatellites for remote sensing operations.
  • Worked with multispectral and hyperspectral images, using deep learning models (LSTMs) for crop yield prediction and for detecting illegal mines.
  • Used the QGIS toolkit, Python and masking techniques to extract 666 districts of India from Landsat 8 images.

Deep Learning Intern

[May 2018 - Jul 2018]

  • Implemented a neural machine translator for a sports analytics firm, Messy Fractals, using a sequence-to-sequence architecture in PyTorch to translate English into regional languages.
  • Wrote a web crawler to scrape, clean and assemble text data as input to the NLP model.
  • Visualized word embedding clusters of Hindi and English text in TensorBoard using t-SNE and k-means.
  • Built a language model using word embeddings to predict the next word or sequence of words in Hindi text.
  • Used Mozilla DeepSpeech (TensorFlow) to implement a speech-to-text model for converting sports commentary to text.

Data Scientist & AI Intern

[Jun 2017 - Jul 2017]

  • Breathe Well-Being is a digital health platform helping people across the chronic disease spectrum to set and reach their goals.
  • Built a chatbot using the motion.ai framework that interacts with users, helps them create customized nutrition plans and monitors their progress. The chatbot reduces the time nutritionists spend with users by up to 30%.
  • Piloted an AI-driven tool that helps ophthalmologists accurately predict the stage of an eye disease and distinguish between different retina-based eye diseases.