Tools: R (tidyverse)
Task Summary:
Analyzed smart fitness device usage data from 5,000+ users to identify behavioral trends by time of day and age group, supporting data-driven marketing strategy improvements.
Key Methods:
Cleaned and organized user activity data using R and tidyverse packages.
Conducted exploratory data analysis to uncover trends in device usage across age groups and time slots.
Identified peak morning usage among users aged 18–34, informing campaign timing for targeted marketing.
Visuals:
Plots of step count against active vs. sedentary minutes.
Plots of calorie count against very active vs. sedentary distance.
Tools: Python (Pandas, Matplotlib)
Task Summary:
Performed exploratory analysis and hypothesis testing on health insurance data to examine cost differences based on smoking status and regional BMI variations.
Key Methods:
Conducted EDA on age, BMI, smoking status, and charges.
Applied two-sample t-tests to compare charges between smokers and non-smokers; mean charges were roughly 4x higher for smokers.
Used Kruskal-Wallis tests to identify significant BMI differences across regions.
Visualized findings for stakeholders to guide targeted wellness initiatives.
Visuals:
Plots of charges by smoker status.
Regional BMI distribution plots.
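The two tests named under Key Methods can be illustrated with a minimal sketch. It assumes Welch's unequal-variance form of the t-test (the summary does not say which variant was used) and computes only the test statistics in plain Python. In practice a statistics library would likely supply the p-values as well. The function names and the sample values are hypothetical and are not taken from the insurance dataset.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (assumes unequal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic, using midranks and no tie correction."""
    pooled = sorted(x for g in groups for x in g)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the mean of the 1-based ranks i+1 .. j.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    n = len(pooled)
    total = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Toy values, invented for illustration only.
group_a = [2, 4, 6]
group_b = [1, 2, 3]
t_stat = welch_t(group_a, group_b)
h_stat = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

A large positive t statistic indicates a higher mean in the first group. A large H indicates that at least one group's rank distribution differs from the others.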
Tools: Python (Pandas, mlxtend, Matplotlib)
Task Summary:
Performed market basket analysis on high-priority transactional data to identify frequently purchased product combinations, supporting targeted bundling and inventory optimization for Allias Megastore.
Key Methods:
Preprocessed order data for high-priority purchases and converted it into transaction (basket) format.
Applied the Apriori algorithm to extract frequent itemsets that meet a minimum support threshold.
Generated association rules, scored by confidence and lift, to uncover strong product pairings for bundling opportunities.
Provided data-driven recommendations for marketing and inventory planning.
Visuals:
Bar chart of top purchased products.
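The support, confidence, and lift metrics used in the market basket analysis can be illustrated with a small plain-Python sketch. It is a brute-force pass over item pairs, not the Apriori algorithm itself and not the mlxtend workflow the project used. The baskets, item names, function names, and the 0.4 support threshold are all hypothetical.

```python
from itertools import combinations

# Toy baskets, invented for illustration (not the project's data).
baskets = [
    {"paper", "pens", "binders"},
    {"paper", "pens"},
    {"paper", "binders"},
    {"pens", "staples"},
    {"paper", "pens", "staples"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= b for b in baskets) / len(baskets)


def pair_rules(baskets, min_support=0.4):
    """Rules x -> y over item pairs, sorted by lift (highest first)."""
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in combinations(items, 2):
        s_ab = support({a, b}, baskets)
        if s_ab < min_support:
            continue  # skip infrequent pairs
        for x, y in ((a, b), (b, a)):
            confidence = s_ab / support({x}, baskets)  # P(y | x)
            lift = confidence / support({y}, baskets)  # > 1: positive association
            rules.append((x, y, s_ab, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)


for x, y, s, c, l in pair_rules(baskets):
    print(f"{x} -> {y}: support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

Rules with lift above 1 are the ones that suggest bundling candidates, since the paired items co-occur more often than their individual frequencies would predict.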