Hello! I'm Kirubasagar
Data Analyst
" Dream, Dream, Dream
  Dreams transform into thoughts
  And thoughts result in action "
     by Dr. APJ Abdul Kalam
Practice Highlights
I led the creation of a Revenue Predictor for New York taxi drivers that predicts tipping behavior as a binary outcome. I trained RandomForest and XGBoost classifiers, validated them with cross-validation, and compared them by F1 score. The XGBoost classifier performed best, with an accuracy of 62.56% and an F1 score of 35.78%. The most important predictors were 'predicted fare', 'mean distance', and 'mean duration', which point to where taxi drivers could adjust their revenue strategies.
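The gap between accuracy and F1 score reported above is typical for tip prediction when one class is harder to catch than the other. Here is a minimal sketch of how both metrics are computed from binary predictions. The labels are made up for illustration and are not the project's data, and `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (illustrative only): 1 = generous tipper, 0 = not.
y_true = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case most of the correct predictions are true negatives, so accuracy stays fairly high while F1, which ignores true negatives, is much lower.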
This Python content-based movie recommendation system uses TF-IDF and cosine similarity to produce personalized film suggestions on demand. Interactive widgets keep the experience simple for the user, while the NLP techniques behind them match movies by the similarity of their text, which improves the quality of the recommendations.
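To illustrate the idea of TF-IDF with cosine similarity, here is a small standard-library sketch. It is not the project's code: the movie titles and descriptions are invented, the function names are hypothetical, and the smoothed IDF formula is an assumption chosen to match the form scikit-learn uses by default.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn each document into a sparse {term: tf-idf weight} dict."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (assumed form): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: f * idf[t] for t, f in Counter(tokens).items()}
            for tokens in tokenized]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, titles, descriptions, top_n=2):
    """Return the top_n titles whose descriptions are most similar."""
    vectors = tfidf_vectors(descriptions)
    query = vectors[titles.index(title)]
    scored = [(cosine_similarity(query, vec), other)
              for other, vec in zip(titles, vectors) if other != title]
    scored.sort(reverse=True)
    return [other for _, other in scored[:top_n]]


# Invented catalogue, for illustration only.
titles = ["Space Quest", "Star Voyage", "Baking Love"]
descriptions = [
    "astronaut crew explores distant planet in space",
    "space crew travels to a distant star",
    "pastry chef finds romance in a small bakery",
]
suggestions = recommend("Space Quest", titles, descriptions, top_n=1)
print(suggestions)
```

The two space films share several terms, so "Star Voyage" ranks above "Baking Love" for a "Space Quest" query.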
This dashboard brings together global COVID-19 data and presents a visual exploration of confirmed cases and recoveries across nations. It supports informed strategies and decisions in the fight against the pandemic.
I analyzed whether video view counts on TikTok differ significantly between verified and unverified accounts. I used a hypothesis test, specifically a two-sample t-test (A/B test). The resulting p-value was far below the standard significance level of 5%, so I rejected the null hypothesis and concluded that the mean video view counts of verified and unverified TikTok accounts differ by a statistically significant amount.
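As a rough sketch of the test statistic behind a two-sample t-test, the snippet below computes Welch's version, which does not assume equal variances. Whether the original analysis used Welch's or the pooled-variance form is an assumption here, the view counts are made up, and `welch_t` is a hypothetical helper name.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Made-up view counts (in thousands), for illustration only.
verified = [10, 12, 14, 16, 18]
unverified = [20, 22, 24, 26, 28]
t_stat, dof = welch_t(verified, unverified)
print(t_stat, dof)
```

A large absolute t statistic relative to the critical value for the given degrees of freedom corresponds to a small p-value. In practice, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns both the statistic and the p-value directly.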
I developed a churn prediction model for Waze to support retention and business growth. After analyzing the variables, training a logistic regression classifier, and evaluating it with metrics such as accuracy and F1 score, the model reached 82.37% accuracy. Surprisingly, 'km_per_driving_day' correlated strongly with churn despite its lower importance in the model, which gives Waze insight for improving retention strategies and user experience.
To understand the link between Sales from ads and Streaming Services Budget, I used Ordinary Least Squares (OLS) regression with TV and Radio as predictor variables. The multiple linear regression model has an R² value of 0.904, meaning that 90.4% of the variation in Sales is explained by the model. This points to a strong connection between advertising spend on TV and Radio and the resulting Sales.
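For readers who want to see what R² measures, here is a minimal standard-library sketch that fits OLS with two predictors by solving the normal equations and then computes R² as 1 - SS_res / SS_tot. The numbers are invented and perfectly linear, so they will not reproduce the 0.904 above, and the function names are hypothetical.

```python
def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... and return [b0, b1, b2, ...]."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented normal equations: (X^T X | X^T y)
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(rows, y, beta):
    """Share of the variance in y explained by the fitted model."""
    preds = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in rows]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, preds))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented (TV, Radio) budgets and Sales generated as 2 + 3*TV + 1*Radio.
budgets = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
sales = [7, 9, 15, 17, 22]
beta = fit_ols(budgets, sales)
r2 = r_squared(budgets, sales, beta)
print(beta, r2)
```

Because the toy data has no noise, the fit recovers the generating coefficients and R² is essentially 1. Real advertising data is noisy, which is why a value such as 0.904 still indicates a strong fit.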