Semester-3

Projects

Title: Sentiment Analysis on IMDb Movie Reviews

Synopsis: Several Naïve Bayes classifiers are trained on a bag-of-words representation of the IMDb Movie Reviews dataset and compared to determine which is best suited for sentiment analysis. The best classifier is then used to predict the sentiment of reviews.
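As a rough illustration of the bag-of-words idea, the sketch below trains one Naïve Bayes variant (multinomial, with Laplace smoothing) in plain Python. The four toy reviews, the function names and the smoothing value `alpha=1.0` are assumptions made for illustration only; they are not taken from the project, which worked on the full IMDb dataset and compared several classifiers.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Bag of words: lowercase the text and split on whitespace, ignoring order.
    return text.lower().split()


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods per class."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in class_docs:
        model["log_prior"][c] = math.log(class_docs[c] / len(docs))
        # Laplace smoothing: add alpha to every word count in the vocabulary.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_lik"][c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest posterior log score."""
    scores = {}
    for c, prior in model["log_prior"].items():
        score = prior
        for w in tokenize(text):
            if w in model["vocab"]:  # words never seen in training are skipped
                score += model["log_lik"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy reviews, not drawn from the IMDb dataset.
train_docs = [
    "a wonderful moving film",
    "great acting and a great story",
    "a dull boring film",
    "terrible acting and a boring story",
]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_multinomial_nb(train_docs, train_labels)

print(predict(model, "great wonderful story"))
print(predict(model, "boring terrible film"))
```

Other Naïve Bayes variants differ mainly in how the per-word likelihood is modelled, so a comparison like the one described above could reuse the same bag-of-words features.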

Student Name:

----------------------------------

Supervisor: Mr. Taranga Mukherjee 


Title: Predicting default payment of credit card clients

Synopsis: Banks and financial institutions play a significant role in providing financial services. To maintain their integrity, they must be careful when extending credit to customers in order to avoid financial loss. The model built in this project uses all available factors in the customer data to predict which customers will default next month. The goal is to determine whether clients will be able to pay their next month's credit amount and which factors are most important in detecting defaults. The dataset.

Student Name:

----------------------------------

Supervisor: Mr. Taranga Mukherjee 

Title: Predicting the risk of heart disease: A Classification Problem using Boosting techniques

Synopsis: The data at hand consist of basic information about a person together with numerous other factors that can contribute to heart disease, such as smoking and alcohol consumption. The data are multivariate. The task is to build a classification model, using different classifiers, that predicts whether a person is at risk of developing heart disease within the next ten years. Three boosting classifiers are used: AdaBoost, Gradient Boosting and Extreme Gradient Boosting. Any class imbalance in the data is addressed first, after which the classification models are built, used for prediction and evaluated with a suitable performance measure. Finally, the techniques are compared to see which gives the best predictions on this dataset.

Student Name:

----------------------------------

Supervisor: Mr. Taranga Mukherjee & Mr. Mayukh Bhattacharya

Title: Predictive power of ML models for Customers Default Payments in Taiwan

Synopsis: This project explains the basic concepts and methodologies of credit risk modeling and why it is important for the financial sector. The main objective is to build classifiers that can identify defaulters and thereby help to minimize the company's losses. The best possible model is one that minimizes false negatives, identifying all defaulters among the client base, while also minimizing false positives, preventing clients from being wrongly classified as defaulters.
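To make the trade-off between false negatives and false positives concrete, here is a minimal sketch that counts both from a set of predictions and derives recall and precision. The labels below are made up for illustration (1 = defaulter, 0 = non-defaulter) and the helper name `confusion_counts` is hypothetical; nothing here reproduces the project's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # defaulter correctly flagged
        elif pred == positive:
            fp += 1  # good client wrongly flagged as a defaulter
        elif truth == positive:
            fn += 1  # defaulter the model missed
        else:
            tn += 1  # good client correctly left alone
    return tp, fp, fn, tn


# Hypothetical labels: 1 = defaulter, 0 = non-defaulter.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
recall = tp / (tp + fn)     # share of actual defaulters that were caught
precision = tp / (tp + fp)  # share of flagged clients who really default

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Fewer false negatives raises recall, and fewer false positives raises precision, so a model that does well on both is the kind of model the synopsis describes.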

Student Name:

----------------------------------

Supervisor: Dr. Hiranmoy Mondal & Mr. Mayukh Bhattacharya

Title: Recommender system for movies - A comparative study using different approaches.

Synopsis: The recommendation system of any web platform adds a whole new dimension to the user's experience by providing real-time, personalized recommendations. It takes a collaborative social-networking approach in which a user’s own tastes are combined with those of the entire community to generate meaningful results. In this project we have used three approaches: a content-based recommendation system, a collaborative filtering recommendation system and a clustering-based recommendation system.
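One of the approaches named above, content-based recommendation, can be sketched in a few lines: each movie is described by a feature vector, and the movies most similar to one the user liked are suggested. The movie titles, the genre features and the use of cosine similarity are all assumptions for illustration; the synopsis does not say which features or similarity measure the project used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical catalogue: one-hot genre flags in the order
# [action, comedy, romance, sci-fi].
movies = {
    "Movie A": [1, 0, 0, 1],
    "Movie B": [1, 0, 0, 0],
    "Movie C": [0, 1, 1, 0],
    "Movie D": [0, 0, 1, 0],
}


def recommend(liked, catalogue, top_n=2):
    """Rank the other movies by similarity to the one the user liked."""
    target = catalogue[liked]
    scored = [
        (cosine(target, vec), title)
        for title, vec in catalogue.items()
        if title != liked
    ]
    # Highest similarity first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


print(recommend("Movie A", movies, top_n=1))
print(recommend("Movie C", movies, top_n=1))
```

Collaborative filtering would replace the genre vectors with vectors of user ratings, while keeping the same similarity-and-rank structure.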

Student Name:

----------------------------------

Supervisor: Dr. Prasanta Narayan Dutta, Dr. Banashree Sen & Ms. Anwesha Sengupta

Title: Psychographic Segmentation - A Cluster Analytic Approach

Synopsis: Strategically segmenting your brand’s audience into smaller sub-groups allows you to create stronger, more targeted brand campaigns, attract more qualified leads, and identify unique new opportunities within existing markets. The analysis is done using SPSS 21.0. In the first stage, a hierarchical cluster analysis is run to find the number of clusters that exist in the data; its outputs are the agglomeration schedule, the vertical icicle plot and the dendrogram using average linkage. In the second stage, a K-means (quick cluster) analysis is run with the number of clusters specified in advance. The study identifies three clusters, whose profiles help to target the most effective segments.
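The second stage described above, K-means with a number of clusters fixed in advance, can be illustrated outside SPSS with a short plain-Python sketch. This is not the SPSS procedure used in the study: the data points, the choice of `k=2` and the deterministic initialisation from the first `k` points are assumptions made so the example is small and repeatable.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Basic K-means; returns (labels, centroids)."""
    # Assumption: use the first k points as initial centroids (deterministic).
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_labels == labels and new_centroids == centroids:
            break  # nothing changed, so the algorithm has converged
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Hypothetical two-dimensional respondent scores forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centroids = kmeans(points, k=2)

print(labels)
print(centroids)
```

In the two-stage design, the value of `k` passed here would come from inspecting the hierarchical stage's dendrogram and agglomeration schedule.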

Student Name:

----------------------------------

Supervisor: Dr. Sankar Prasad Mondal