Human Resources Modelling using AI

I used multiple machine learning models to evaluate the likelihood of employee retention. The best model predicted employee retention with 89% accuracy; however, I focused on extracting insights rather than treating it as a black box.

The data set describes a company active in the Big Data and data analysis space that offers paid training courses to some of its employees. Fortunately, many employees have signed up for these courses. However, the company keeps running into the same problem: upon finishing a course, the employee ends up switching companies. The company would like to know whether a candidate is going to jump ship after finishing the course.


To summarize the results of the analysis, multiple machine learning algorithms were trained and tested on the data. The best-performing algorithm was the K Nearest Neighbors classifier, with an accuracy of about 89%. However, because the decisions involved affect employees' livelihoods, the model should not be blindly trusted.
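As a rough illustration of the K Nearest Neighbors approach mentioned above, here is a minimal pure-Python sketch. It is not the project's actual code: the data is synthetic, and the helper names (`knn_predict`, `accuracy`, `make_synthetic_data`) and the two features are assumptions made up for this example.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Predict the label of x by majority vote among its k nearest training rows."""
    # Rank training rows by Euclidean distance to x and keep the k closest.
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=5):
    """Fraction of test rows whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


def make_synthetic_data(n=200, seed=0):
    """Build a toy data set (hypothetical, NOT the real HR data).

    Label 1 stands for "leaves after training", label 0 for "stays".
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumed signal: leavers tend to have a lower city index.
        city_index = rng.gauss(0.9 if label == 0 else 0.6, 0.08)
        # A second feature that carries no information about the label.
        noise = rng.random()
        X.append((city_index, noise))
        y.append(label)
    return X, y


X, y = make_synthetic_data()
split = int(0.75 * len(X))  # simple 75/25 hold-out split
train_X, train_y = X[:split], y[:split]
test_X, test_y = X[split:], y[split:]

holdout_accuracy = accuracy(train_X, train_y, test_X, test_y, k=5)
print(f"hold-out accuracy: {holdout_accuracy:.2f}")
```

Accuracy is measured on rows the classifier never saw during training, which is the only way a figure like 89% says anything about new employees.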


City development index shows the highest average importance to the model's accuracy. This makes intuitive sense, as companies are spending billions to locate themselves in desirable cities. The second major insight is that gender has no significant importance to the model's accuracy. Women in the workplace have historically faced bias and might be passed over for training because they may be seen as a flight risk. In this data set, that bias has no empirical support.
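One common way to measure a feature's "importance to the model's accuracy" is permutation importance: shuffle one feature column at a time and record how much accuracy drops. The sketch below assumes that technique; this summary does not say which method the project used. The data is synthetic, and the feature encoding and function names are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training rows closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(predict, X, y):
    """Fraction of rows in X that predict() labels correctly."""
    return sum(predict(x) == label for x, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column, breaking its link to the labels.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [
                row[:col] + (value,) + row[col + 1:]
                for row, value in zip(X, shuffled)
            ]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic stand-in data (an assumption, NOT the real HR data set):
# the city index is tied to the label, the gender flag is pure noise.
feature_names = ["city_development_index", "gender"]
rng = random.Random(1)
rows, labels = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    city_index = rng.gauss(0.9 if label == 0 else 0.6, 0.08)
    gender_flag = float(rng.randint(0, 1))
    rows.append((city_index, gender_flag))
    labels.append(label)

train_X, train_y = rows[:150], labels[:150]
test_X, test_y = rows[150:], labels[150:]


def predict(x):
    return knn_predict(train_X, train_y, x)


importances = permutation_importance(predict, test_X, test_y)
for name, score in zip(feature_names, importances):
    print(f"{name}: {score:+.3f}")
```

A score near zero means shuffling the feature barely changes accuracy, so the model gains little from it. That is the sense in which a feature such as gender can have no significant importance.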



Exploratory Data Analysis

Machine Learning Model Accuracy (ROC)

Average Importance to Model's Accuracy

The full project is published on my Kaggle account. Below are links to the full write-up of the project and to my Kaggle profile, which hosts some of my other data analysis and machine learning projects.

Full Write Up

Kaggle Profile