Problem:
A business needed to segment its customers to improve marketing strategies, personalize offers, and boost retention. Traditional demographic segmentation was not delivering actionable insights.
Approach:
I combined SQL for data preprocessing and K-Means clustering in Python to segment customers based on purchasing behavior.
Used SQL (via DuckDB) to clean and explore the raw dataset.
Selected relevant features such as Annual Income and Spending Score.
Scaled the features so that neither dominates the distance calculation, then applied K-Means clustering to identify meaningful customer segments.
Visualized the clusters with matplotlib and seaborn to show the distinct customer groups.
Tools:
DuckDB: SQL queries within Python
pandas: Data handling
scikit-learn: K-Means clustering
matplotlib, seaborn: Visualization
Outcome:
The analysis uncovered three clear customer segments:
High-income, high-spending customers (ideal for loyalty programs)
Low-income, high-spending customers (price-sensitive but brand-loyal)
High-income, low-spending customers (potential for upselling)
These insights can drive targeted marketing, improve customer retention, and increase ROI on campaigns.
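One possible way to arrive at segment descriptions like those above is to map the K-Means centroids back to the original units and compare each against the overall average. The sketch below uses synthetic data; the column names, the three blob centers, and the "above or below the overall mean" labeling rule are illustrative assumptions, not the project's actual method or results.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic (income, score) blobs standing in for real customers.
rng = np.random.default_rng(1)
centers = [(90, 80), (25, 80), (90, 20)]
df = pd.DataFrame(
    np.vstack([rng.uniform(-10, 10, size=(30, 2)) + c for c in centers]),
    columns=["annual_income", "spending_score"],
)

scaler = StandardScaler()
X = scaler.fit_transform(df)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Centroids are in scaled space; convert them back to the original units.
centroids = pd.DataFrame(
    scaler.inverse_transform(kmeans.cluster_centers_),
    columns=df.columns,
)

# Label each centroid "high" or "low" relative to the overall feature mean.
levels = pd.DataFrame(
    np.where(centroids > df.mean(), "high", "low"),
    columns=["income_level", "spending_level"],
)
profile = pd.concat([centroids.round(1), levels], axis=1)
print(profile)
```

Cluster numbers assigned by K-Means are arbitrary, so the labels should be read from the centroid values rather than from the cluster index.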