For this project, I used decision trees and ensemble machine learning models to predict customer churn. To handle class imbalance, I used tree-based models with the "balanced" class-weighting parameter, so that the minority (churn) class was not drowned out by the majority class.
To evaluate the models, I used precision-recall curves, and I used grid search for hyperparameter tuning to maximize recall.
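The workflow above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` in place of real churn data, and the random forest and the small parameter grid are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for a churn table: roughly 10% positive (churned) rows.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights classes inversely to their frequency.
model = RandomForestClassifier(class_weight="balanced", random_state=0)

# Hypothetical, deliberately small grid; scoring="recall" selects for recall.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 6]}
search = GridSearchCV(model, param_grid, scoring="recall", cv=3)
search.fit(X_train, y_train)

# Precision-recall curve on held-out data, from predicted churn probabilities.
churn_scores = search.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, churn_scores)

test_recall = recall_score(y_test, search.predict(X_test))
print(search.best_params_, round(test_recall, 3))
```

Optimizing for recall alone tends to trade away precision, so the precision-recall curve is a useful check on how many false alarms the tuned model produces at a given recall level.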
In this project, I scraped job portals to extract raw data, then cleaned and transformed it. Finally, I turned the data into insights through visualizations.
Tools and Techniques:
Google Colab, Python, Beautiful Soup, Matplotlib, Seaborn, NumPy, Pandas
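The scrape-then-clean steps of the job-portal project might look something like the sketch below. The HTML snippet, tag names, and class names are hypothetical stand-ins for a real results page, and no actual portal is being queried here. A real scraper would first fetch the page over HTTP and should respect the site's terms of use.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a job-portal results page.
html = """
<div class="job"><h2> Data Analyst </h2><span class="company">Acme</span><span class="salary">$70,000</span></div>
<div class="job"><h2>ML Engineer</h2><span class="company">Globex</span><span class="salary">$120,000</span></div>
<div class="job"><h2>Data Analyst</h2><span class="company">Acme</span><span class="salary">$70,000</span></div>
"""

# Extraction: pull one record per job card.
soup = BeautifulSoup(html, "html.parser")
rows = []
for card in soup.find_all("div", class_="job"):
    rows.append({
        "title": card.find("h2").get_text(strip=True),
        "company": card.find("span", class_="company").get_text(strip=True),
        "salary": card.find("span", class_="salary").get_text(strip=True),
    })

# Cleaning: drop duplicate postings and convert salary text to integers.
jobs = pd.DataFrame(rows).drop_duplicates().reset_index(drop=True)
jobs["salary"] = jobs["salary"].str.replace(r"[$,]", "", regex=True).astype(int)
print(jobs)
```

Once the data is in a tidy DataFrame like this, it can be passed to Matplotlib or Seaborn for plotting.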
I created a dashboard for Altik Hardware using SQL to query, transform, and clean data. The data was then loaded into Power BI to create an efficient dashboard for managerial decision-making.
Tools and Techniques:
Excel, Power BI, MySQL, Advanced SQL Queries
I am building an NLP pipeline for extracting information from news articles.
The pipeline extracts information from unstructured news articles and transforms it into a structured relational database.
The code will trigger an update of an AWS S3 bucket, which will feed a BI tool such as Tableau to visualize the queried results in a dashboard.
Tools and Techniques:
Jupyter Notebooks, scikit-learn, Pandas, NumPy, spaCy, AWS, transformer models, Hugging Face, and Tableau.
This project is WIP🏗️👷
In an era where data-driven decision-making reigns supreme, statistics serves as the backbone of our analytical endeavors. With the advent of machine learning (ML) and artificial intelligence (AI) applications, the power of statistics has only grown more pronounced.
In my project "Deceptive Statistics", I dive into the nuanced elements of statistical analysis that can lead to deceptive conclusions, and I explore strategies for navigating these pitfalls.
Through this project, I aim to shed light on common statistical fallacies, such as correlation vs. causation, Simpson's paradox, and the dangers of extrapolation.
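As a small illustration of one of these fallacies, the sketch below shows Simpson's paradox with made-up counts (the options "A" and "B" and the two subgroups are purely hypothetical). Option A has the higher success rate within each subgroup, yet option B looks better once the subgroups are pooled, because the two options are spread very differently across the subgroups.

```python
# Made-up counts for illustration: (successes, trials) per option and subgroup.
counts = {
    "A": {"group_1": (18, 20), "group_2": (40, 80)},
    "B": {"group_1": (68, 80), "group_2": (8, 20)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


# Success rate within each subgroup.
per_group = {
    option: {group: rate(*pair) for group, pair in groups.items()}
    for option, groups in counts.items()
}

# Success rate after pooling the subgroups together.
pooled = {
    option: rate(
        sum(s for s, _ in groups.values()),
        sum(t for _, t in groups.values()),
    )
    for option, groups in counts.items()
}

for group in ("group_1", "group_2"):
    print(group, per_group["A"][group], per_group["B"][group])
print("pooled", pooled["A"], pooled["B"])
```

Reading only the pooled rates would favor B, while reading only the subgroup rates would favor A. Which comparison is appropriate depends on how the subgroups relate to the outcome.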
This project is WIP.🏗️👷