Course Projects

Explainable Movie Recommender

Developed a transparent, explainable movie recommendation system in Python. The tool shows which genres the user typically watches and explains each recommendation in natural language.
Libraries used: numpy, scipy, pandas, nltk

Image Classification

Developed a simple image classification pipeline from scratch on the CIFAR-10 dataset, based on k-Nearest Neighbor, SVM/Softmax classifiers, and a two-layer neural network.
Implemented a two-layer fully connected network and a three-layer CNN from scratch, and experimented with dropout and batch normalization on CIFAR-10. The model achieves around 50% accuracy.
Implemented a deep network with a ResNet-like architecture in PyTorch; the model achieves around 72% accuracy on CIFAR-10.
Libraries used: numpy, scipy, matplotlib, PyTorch
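
To illustrate the k-Nearest Neighbor step of the pipeline, here is a minimal sketch using only the Python standard library. It is not the project's code: the function name `knn_predict`, the toy 2-D points, and the choice of Euclidean distance with majority voting are assumptions made for illustration, whereas the project itself worked on CIFAR-10 images with numpy.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    # Count the labels of the k nearest neighbors and pick the most common one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated clusters.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["cat", "cat", "cat", "ship", "ship", "ship"]
```

Calling `knn_predict(train_X, train_y, (0.2, 0.1))` votes among the three points of the first cluster. On real images each point would be a flattened pixel vector, and the distance computation would normally be vectorized.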

House Price Prediction

Part of a Kaggle competition that challenges participants to predict the final price of each home from 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa.
My submission ranked in the top 6% on Kaggle. It predicts the final price with advanced regression techniques, ensembling StackedRegressor, XGBoost, and LightGBM.
Libraries used: Pandas, NumPy, Matplotlib, scikit-learn

Real or Not? NLP with Disaster Tweets

Part of a Kaggle competition that challenges participants to build a machine learning model that predicts which tweets are about real disasters and which ones aren’t.
Performed exploratory data analysis and preprocessing, generated word clouds of common words in real and non-real disaster tweets, and built a DNN model with a bidirectional GRU that reaches an accuracy of 71.437%.

WebCrawler

A web crawler built in Python that traverses the pages reachable from a user-specified root URL using the Iterative Deepening Search (IDS) algorithm.
The program saves each URL's HTML to a file and runs a character unigram feature extractor on those files.
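
As a rough illustration of the crawl strategy and the feature extraction, the sketch below runs iterative deepening over a link graph and counts character unigrams. It is not the project's code: `get_links` is a hypothetical callback standing in for fetching a page and extracting its links, the in-memory `graph` replaces real HTTP requests, and counting every character as-is (including whitespace and case) is an assumption.

```python
from collections import Counter

def depth_limited(url, get_links, limit, path, found):
    """Depth-first visit from url, going at most `limit` links deeper."""
    found.add(url)
    if limit == 0:
        return
    for link in get_links(url):
        # Skip links already on the current path to avoid walking in cycles.
        if link not in path:
            depth_limited(link, get_links, limit - 1, path | {link}, found)

def iterative_deepening_crawl(root, get_links, max_depth):
    """Repeat the depth-limited search with limits 0, 1, ..., max_depth."""
    reached = set()
    for limit in range(max_depth + 1):
        depth_limited(root, get_links, limit, {root}, reached)
    return reached

def char_unigrams(text):
    """Count how often each single character occurs in text."""
    return Counter(text)

# Hypothetical link graph used in place of real pages.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["a"], "d": ["e"], "e": []}
pages = iterative_deepening_crawl("a", lambda u: graph.get(u, []), max_depth=2)
```

With `max_depth=2` the crawl reaches `a`, `b`, `c`, and `d` but not `e`, which sits three links from the root. A real crawler would also save each page's HTML to disk before running `char_unigrams` on the saved files.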

Topic Modeling with LDA

News Category Prediction

Used logistic regression to predict news categories (e.g., Entertainment, Games) with 87% accuracy.
Libraries used: numpy, pandas, sklearn

cpmFS

Designed and implemented a simple file system called cpmFS that lets users list directory entries and rename, copy, and delete files, along with code to open, close, read, and write files.

pWordCount

pWordCount is a pipe-based word count tool in which two processes cooperate through Unix pipes to count the words in a text file.
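
The idea of two processes cooperating over pipes can be sketched in Python as follows. This is an assumption-laden illustration, not the original tool: the function name `pipe_word_count`, the two-pipe protocol (the parent sends the text, the child sends back the count), and whitespace-based word splitting are all hypothetical choices, and `os.fork` limits the sketch to Unix-like systems.

```python
import os

def _read_all(fd):
    """Read from a file descriptor until end-of-file."""
    chunks = []
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)

def pipe_word_count(data: bytes) -> int:
    """Send data to a child process over one pipe; read the word count back over another."""
    to_child_r, to_child_w = os.pipe()
    to_parent_r, to_parent_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: close unused ends, read the text, count words, reply, exit.
        os.close(to_child_w)
        os.close(to_parent_r)
        text = _read_all(to_child_r)
        os.close(to_child_r)
        os.write(to_parent_w, str(len(text.split())).encode())
        os.close(to_parent_w)
        os._exit(0)
    # Parent: close unused ends, send the text, then signal EOF by closing.
    os.close(to_child_r)
    os.close(to_parent_w)
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(to_child_w, view)
        view = view[written:]
    os.close(to_child_w)
    result = _read_all(to_parent_r)
    os.close(to_parent_r)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return int(result.decode())
```

To count the words in a file, the parent would read the file's bytes and pass them to `pipe_word_count`. Closing the write end of the first pipe is what lets the child see end-of-file and start counting.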