Academic/Industry Projects

  • Worked on Search Relevance for this product.

Knowledge Extraction from Unstructured Text using bi-directional LSTM and hierarchical LSTM

  • Developed deep learning models to extract knowledge as (Subject, Object, Predicate) triples from unstructured text.
  • Leveraged Microsoft's Satori Knowledge Graph for distant supervision.
  • Technologies: TensorFlow

Capability Categorization and Slot Detection in Spoken Language Understanding

September 2015 – December 2015
  • Designed two components for capability categorization and slot detection
  • Capability categorization: categorize the spoken utterance so that the dialog system can choose how to respond.
  • Slot detection: detect slots in the spoken utterance so that the dialog system can find entities in the knowledge base and form an appropriate response.
  • Used a Maximum Entropy model for capability categorization and Conditional Random Fields for slot detection.
  • Technologies: Python
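As an illustration of the capability-categorization step: a Maximum Entropy classifier is equivalent to multinomial logistic regression, so it can be sketched with scikit-learn (the utterances, labels, and categories below are invented for illustration, not from the original system):

```python
# Hypothetical sketch: Maximum Entropy (multinomial logistic regression)
# capability categorization over bag-of-words features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

utterances = [
    "play some jazz music", "play the next song",
    "what is the weather today", "will it rain tomorrow",
    "set an alarm for six", "wake me up at seven",
]
labels = ["music", "music", "weather", "weather", "alarm", "alarm"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(utterances)

# Multinomial logistic regression == Maximum Entropy classifier.
maxent = LogisticRegression(max_iter=1000)
maxent.fit(X, labels)

def categorize(utterance):
    """Predict the capability category for a spoken utterance."""
    return maxent.predict(vectorizer.transform([utterance]))[0]
```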

Activity Recognition using Recurrent Neural Networks

October 2015 – December 2015
  • Developed an Activity Recognition system using Artificial Neural Networks.
  • The goal is to recognize daily human activities such as eating, cooking, walking, riding, exercising, and driving, based on sensor data.
  • Implemented a Multi-layer Perceptron (MLP) network with different neuron activation functions (rectified linear unit, sigmoid, hyperbolic tangent) to recognize activity type from time series sensor data.
  • Achieved 87% 5-fold cross-validation accuracy and 84% accuracy on an 80/20 train-test split.
  • Dataset used: Skoda Mini Checkpoint dataset
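The activation-function comparison above can be sketched with scikit-learn's MLPClassifier; synthetic data stands in for the Skoda sensor windows, and all sizes and parameters are illustrative, not the original setup:

```python
# Illustrative sketch (not the original code): comparing MLP activation
# functions (ReLU / sigmoid / tanh) with 5-fold cross-validation.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Two synthetic "activities": windows of sensor readings with different means.
X = np.vstack([rng.normal(0.0, 1.0, (100, 20)), rng.normal(2.0, 1.0, (100, 20))])
y = np.array([0] * 100 + [1] * 100)

for activation in ("relu", "logistic", "tanh"):
    mlp = MLPClassifier(hidden_layer_sizes=(32,), activation=activation,
                        max_iter=500, random_state=0)
    scores = cross_val_score(mlp, X, y, cv=5)  # 5-fold CV, as above
    print(activation, round(scores.mean(), 3))
```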

Automatic Gloss Extraction from Large Scale Datasets like Wikipedia, ClueWeb09 and Wiktionary for Gloss-free Knowledge Bases (Link)

August 2014 – May 2015
  • Worked under Prof. William Cohen (Machine Learning Department, School of Computer Science, Carnegie Mellon University) on a component of the NELL (Never-Ending Language Learner) ecosystem that extracts information from large-scale natural language corpora using Machine Learning and Natural Language Processing algorithms, adding to the reasoning power of gloss-free Knowledge Bases like NELL.

Collaborative Filtering for Product Recommendations

December 2014 – December 2014
  • Designed and implemented a memory-based Collaborative Filtering approach to predict a new user's profile over a set of products and make inferences for product recommendations.
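The memory-based approach can be sketched as user-based collaborative filtering: a new user's missing ratings are predicted as a similarity-weighted average of other users' ratings. The ratings matrix and values below are invented for illustration:

```python
# Hedged sketch of memory-based (user-based) collaborative filtering.
import numpy as np

ratings = np.array([  # rows: users, cols: products; 0 = not rated
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def cosine(u, v):
    """Cosine similarity computed only over co-rated products."""
    mask = (u > 0) & (v > 0)
    if not mask.any():
        return 0.0
    return float(u[mask] @ v[mask] / (np.linalg.norm(u[mask]) * np.linalg.norm(v[mask])))

def predict(new_user, item):
    """Similarity-weighted average of neighbours' ratings for `item`."""
    sims = np.array([cosine(new_user, r) for r in ratings])
    rated = ratings[:, item] > 0
    if not (rated & (sims > 0)).any():
        return 0.0
    w = sims[rated]
    return float(w @ ratings[rated, item] / w.sum())

new_user = np.array([5, 0, 0, 1], dtype=float)  # likes product 0, dislikes 3
print(predict(new_user, 1))  # 4.5
```

Restricting the similarity to co-rated products avoids treating "not rated" (encoded as 0) as a genuine low rating.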

Object Recognition using Unsupervised Feature Learning

November 2014 – December 2014
  • Developed a system for object recognition in images from the CIFAR-10 collection. Implemented an unsupervised feature learning framework for feature-space transformation, convolutional feature extraction methods, and classification

Hidden Markov Model for Predictions

November 2014 – November 2014
  • Designed a Python module for modeling Hidden Markov Models, implementing the dynamic-programming-based Viterbi algorithm.
  • Given prior, transition, and emission probabilities, the user can obtain predictions such as the most likely hidden state sequence for a given observed output sequence, the probability of observing a sequence of outputs, and the probability of a specific output given the sequence of outputs observed so far.
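A minimal Viterbi sketch in the spirit of the module described above (the module's actual API is not shown, so the function signature and the classic weather toy example here are illustrative):

```python
# Viterbi: most likely hidden state sequence via dynamic programming.
def viterbi(obs, states, prior, trans, emit):
    """Return the most likely hidden state sequence for `obs`."""
    V = [{s: prior[s] * emit[s][obs[0]] for s in states}]  # path probabilities
    back = [{}]                                            # backpointers
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans[p][s] * emit[s][obs[t]], p) for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy example: infer weather (hidden) from observed activities.
states = ("Rainy", "Sunny")
prior = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3}, "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(["walk", "shop", "clean"], states, prior, trans, emit))
# ['Sunny', 'Rainy', 'Rainy']
```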

Image Segmentation through Unsupervised Machine Learning

November 2014 – November 2014

Input Text Predictor using Statistical Language Modeling

December 2014 – December 2014
  • Designed and developed an Input Text Prediction system by building a statistical language model from n-grams. As the user types, it predicts the next words and offers them as auto-complete suggestions, similar to Google Instant.
  • Designed and developed system using MapReduce paradigm on Apache Hadoop and Apache HBase
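The n-gram idea behind the predictor can be sketched in a few lines. The original ran as MapReduce jobs over Hadoop and HBase; this toy version builds a bigram model in memory with an invented corpus:

```python
# Sketch of a bigram language model for next-word auto-complete.
from collections import Counter, defaultdict

corpus = "the quick brown fox jumps over the lazy dog the quick brown cat".split()

# Count bigrams: for each word, how often each word follows it.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def autocomplete(word, k=3):
    """Top-k most likely next words after `word`, most frequent first."""
    return [w for w, _ in bigrams[word].most_common(k)]

print(autocomplete("the"))  # ['quick', 'lazy']
```

In the MapReduce setting the same counts would be produced by a map phase emitting (prev, next) pairs and a reduce phase aggregating them per key.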

Handwritten Digit Recognition using Logistic Regression

October 2014 – October 2014
  • Designed and implemented a logistic regression classifier to recognize the digits 0–9 in 28x28 grayscale images. Dataset used: MNIST

Handwritten Digit Recognition using Neural Network

October 2014 – October 2014
  • Designed and implemented a neural network to recognize the digits 0–9 in 28x28 grayscale images. Dataset used: MNIST

Twitter Analytics Web Services

August 2014 – December 2014
  • Developed a system for analytics on a Twitter dataset. Designed and developed the system using Amazon AWS Elastic MapReduce clusters, Elastic Load Balancing, Auto Scaling, Apache Hadoop, and Apache HBase
  • Languages: Java and Go

Wikipedia Traffic Analysis using Amazon Web Services- Elastic MapReduce

September 2014 – October 2014
  • Developed a system to analyze a Wikipedia article dataset and derive insights related to pageviews and traffic using Amazon Web Services' Elastic MapReduce framework.
  • Technologies: Hadoop Streaming API

AutoScaling through Custom Load Balancer on Amazon AWS

August 2014 – September 2014
  • Designed and implemented custom load balancing algorithm for compute cluster on Amazon AWS using Amazon AWS API.
  • Supports custom metrics and rules as triggering criteria for scale-up and scale-down operations
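The triggering rules might look like the following sketch; the metric values, thresholds, and window size are hypothetical, not the original configuration:

```python
# Hypothetical threshold-based scaling rule: average the last `window`
# samples of a custom metric (e.g. CPU %) and decide an action.
def scaling_decision(metric_values, scale_up_at=70.0, scale_down_at=30.0, window=3):
    """Return 'scale-up', 'scale-down', or 'no-op' for the recent samples."""
    recent = metric_values[-window:]
    avg = sum(recent) / len(recent)
    if avg > scale_up_at:
        return "scale-up"
    if avg < scale_down_at:
        return "scale-down"
    return "no-op"

print(scaling_decision([65, 80, 90]))  # sustained high load -> "scale-up"
```

Averaging over a window rather than reacting to a single sample avoids flapping between scale-up and scale-down on transient spikes.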

LinkedIn - Search and Recommendations

  • Role: Software Engineer- Search, LinkedIn Corporation
  • Brief: The Search & Recommendations team makes content discoverable through people and people discoverable through content. We work on complex search and recommendation problems, including search relevance, socializing search, word stuffing, contextual search, and delivering the right content to the right people at the right time!

Content Classification Engine

  • Role: Software Engineer- Search, LinkedIn Corporation
  • Brief: Designed and developed a content classification engine using Natural Language Processing and supervised online Machine Learning algorithms.
  • Devised a framework for defining custom content readers in different formats for different sources
  • Trained the engine using domain articles for various categories
  • Designed an incremental learning model so the system keeps learning over time
  • Developed a web server exposing RESTful web services for category prediction by external clients
  • Designed functionality to auto-save the engine's learning state during shutdown and restore it during restart.
  • Using: Python, scikit-learn, Machine Learning, Natural Language Processing
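The incremental-learning design can be illustrated with scikit-learn's online estimators: SGDClassifier supports partial_fit, so a model can keep learning from new labelled articles without retraining from scratch. The specific estimator, vectorizer, categories, and data here are assumptions, not the engine's actual internals:

```python
# Illustrative sketch of online (incremental) text classification.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2 ** 12)  # stateless: safe for streaming
clf = SGDClassifier(random_state=0)
classes = ["sports", "tech"]  # hypothetical categories

def learn(texts, labels):
    """Incrementally update the model with a new batch of labelled articles."""
    clf.partial_fit(vectorizer.transform(texts), labels, classes=classes)

# Batches arrive over time; no full retrain is needed.
learn(["the team won the match", "a great goal in the final"], ["sports", "sports"])
learn(["new laptop cpu released", "software update for the phone"], ["tech", "tech"])

print(clf.predict(vectorizer.transform(["the match final score"]))[0])
```

The model's state (`clf` and its learned weights) is exactly what the auto-save/restore functionality would serialize at shutdown and reload at restart.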


SocioBot: Social Media Analytics Engine

  • With the advent of the internet and the proliferation of smartphones, almost everyone maintains an internet presence through social and professional networks, blogs, and other web resources, generating activity at every moment. Capturing all the activities a person performs across these places makes it possible to learn about them and help them make better life decisions based on real, logical facts. The data produced is far too large to process and analyze manually, making this a Big Data problem. Our solution is SocioBot, a Social Media Analytics Engine that makes intelligent predictions, helping businesses understand customers better, market their brand in the best possible way, and capture new potential leads.
  • Technologies: C, C++, Python, Machine Learning, Natural Language Processing, Semantic Networks

Pipes: Sync your Salesforce Clouds in flash with clicks

  • Description: Developed an intuitive utility for high-speed data synchronization, metadata synchronization, and mirror replication of cloud environments.
  • Technologies: VB.Net, Salesforce's cloud platform (PaaS), Synchronization Algorithms on Windows

DevelopWeb: Web Development IDE for OLPC XO laptops (Link)
  • Description: DevelopWeb is an Activity for Web Development with which children can develop websites using HTML, JavaScript, and other web technologies. Children can quickly learn how to develop web pages in a step-by-step approach through the examples provided for each HTML component.
  • Technologies: Python, GTK+, Sugar Desktop Environment, XO-1 Laptop

WikipediaHI: Offline Hindi Wikipedia for OLPC XO laptops (Link)

  • Description: WikipediaHI is a Sugar Desktop Environment activity providing offline access to Wikipedia in Hindi
  • Technologies: Python, GTK+, Sugar Desktop Environment, XO-1 Laptop

Oopsy: C/C++ Development IDE for OLPC XO laptops (Link)

  • Description: Oopsy is a Sugar Desktop Environment activity that allows children to develop, compile, and execute C/C++ programs to learn, explore, and have fun!
  • Technologies: Python, GTK+ 3.0, Sugar Desktop Environment, XO-1 Laptop

Breezeway

  • CSC's Breezeway™ is a suite of cloud-based, Web-subscription insurance services. Among the wide range of benefits it provides to carriers, Breezeway:
  • Increases agent satisfaction by providing real-time information and streamlining customer and agent interaction
  • Improves customer retention by speeding responses to customer requests and shortening the acquisition and application cycle
  • Boosts revenue and market share by decreasing sales cycles and automating manual processes.
  • Breezeway combines cloud and mobile technology to enhance sales processes and give insurers, agents, brokers and customers anytime-anywhere access to a full range of insurance services, including automated third-party capabilities such as electronic signatures.

XS Community Edition (Link)

  • The School Server Community Edition provides communication, networking, content, and maintenance to schools and classrooms. In everyday usage the school server provides services that extend the capabilities of the connected laptops while remaining transparent to the user. These services include:
  • Classroom connectivity – Similar to what you would find in an advanced home router.
  • Internet gateway – If available, an internet connection is made available to laptops.
  • Content – Tools to make instructional media available to their schools and classrooms.
  • Maintenance – Tools to keep laptops updated and running smoothly.

Pebble Grid Computing Framework: A generic grid computing framework and library for high performance computing(Link)

  • Developed a grid computing framework for distributed computing of complex problems with large data sets
  • Designed and implemented algorithms to support distributed computing over LAN, MAN, and WAN (using port forwarding at the ISP's router)
  • Implemented the framework to support different types of grid architectures: Master-Slave, Peer-to-Peer, and Hierarchical
  • Designed and developed daemons and libraries that let users submit and monitor jobs on the grid.
  • Designed scheduling algorithms for job scheduling in grid systems: Priority, First In First Out (FIFO), and Shortest Job First (SJF) scheduling.
  • Implemented a framework feature that allows users to write custom scheduling algorithms for job scheduling.
  • Designed and developed algorithms that allow dynamic participation of computing resources within a grid, runtime resource discovery, and resource reservation.
  • Performed experiments on high-speed web crawling of a large-scale URL frontier for search engines to benchmark the framework.
  • Best suited for high-speed web crawling, image processing, and any job with a sufficient degree of parallelism.
  • Technologies: C, C++, BSD Sockets, Job Scheduling, Load Balancing Algorithms
  • Operating Systems: Linux, Windows
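The pluggable scheduling policies named above (FIFO, Priority, SJF) can be sketched as sort keys over a job queue; the Job fields and function names below are illustrative, not the framework's real API:

```python
# Sketch of pluggable grid scheduling policies as ordering functions.
from dataclasses import dataclass, field
from itertools import count

_arrival = count()  # monotonically increasing submission order

@dataclass
class Job:
    name: str
    priority: int             # lower value = higher priority
    estimated_runtime: float  # seconds; used by SJF
    order: int = field(default_factory=lambda: next(_arrival))

def fifo(jobs):
    """First In First Out: run in submission order."""
    return sorted(jobs, key=lambda j: j.order)

def by_priority(jobs):
    """Priority scheduling: highest-priority (lowest value) first."""
    return sorted(jobs, key=lambda j: j.priority)

def sjf(jobs):
    """Shortest Job First: shortest estimated runtime first."""
    return sorted(jobs, key=lambda j: j.estimated_runtime)

jobs = [Job("crawl", 2, 120.0), Job("resize", 1, 5.0), Job("index", 3, 60.0)]
print([j.name for j in sjf(jobs)])  # ['resize', 'index', 'crawl']
```

A custom policy, as the framework allows, would simply be another function from a job list to an ordered job list.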