Customer Segmentation Analysis Using Machine Learning
The goal of this project is to perform customer segmentation analysis for an e-commerce company. By analyzing customer behavior and purchase patterns, we aim to group customers into distinct segments. These segments will inform targeted marketing strategies, improve customer satisfaction, and enhance overall business performance.
Segment the customer dataset.
Perform customer segmentation analysis to uncover insights into customer behavior, purchase patterns, and marketing campaign results.
Provide insights and recommendations for the marketing team.
The dataset contains 2,205 rows and 39 columns, including:
Customer Demographics: Income, education, marital status, age, etc.
Purchase Behavior: Amount spent on wine, fruits, meat, fish, sweets, and gold.
Engagement Metrics: Number of web visits, store purchases, catalog purchases, etc.
Marketing Campaign Responses: Acceptance of campaigns 1-5.
Checked for missing values (none found).
Converted column names to lowercase for consistency.
Removed irrelevant columns (z_revenue, z_costcontact).
Detected and handled outliers in the income column.
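The cleaning steps above could be sketched in pandas roughly as follows. This is a minimal sketch, not the project's actual code: the helper name `clean_customers` is hypothetical, the sample rows are made up, and the outlier rule (dropping incomes outside 1.5 times the interquartile range) is an assumption, since the write-up does not say how outliers were detected or handled.

```python
import pandas as pd

def clean_customers(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the listed cleaning steps to a raw customer table (hypothetical helper)."""
    out = df.copy()
    # Lowercase column names for consistency.
    out.columns = out.columns.str.lower()
    # Drop the columns named as irrelevant; errors="ignore" tolerates their absence.
    out = out.drop(columns=["z_revenue", "z_costcontact"], errors="ignore")
    # Assumption: an outlier is an income outside 1.5 * IQR of the quartiles,
    # and "handled" means the row is dropped.
    q1, q3 = out["income"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out["income"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)

# Tiny made-up table, for illustration only.
raw = pd.DataFrame({
    "Income": [30000, 42000, 51000, 58000, 66000, 600000],
    "Z_Revenue": [11] * 6,
    "Z_CostContact": [3] * 6,
})
missing_total = int(raw.isna().sum().sum())  # total count of missing values
cleaned = clean_customers(raw)
print(missing_total, cleaned.shape)
```

Capping extreme incomes at the IQR bounds instead of dropping the rows would be an equally plausible reading of "handled".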
Created new features:
Education: Categorized as Postgraduates, Graduates, or Undergraduates.
Family Size: Sum of kidhome and teenhome.
Marital Status: Simplified into "Single" or "Partnered."
Age Group: Categorized as Young Adult, Adult, Middle-Aged Adult, or Senior.
Income Group: Divided into low, mid, high, and very high based on quartiles.
Engagement Score: Sum of web purchases, catalog purchases, store purchases, and web visits.
Total Amount Spent: Sum of spending across all product categories.
K_Marital and K_Education: Numeric encodings of the categorical variables marital status and education, created with NumPy’s select function so the model can use them.
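As an illustration of the feature list above, here is a small pandas sketch. The rows are made up, and every column name other than `income`, `kidhome`, and `teenhome` is a placeholder. Two further assumptions: "Married" and "Together" are the labels that map to "Partnered", and `k_marital` uses the codes 0 and 1.

```python
import numpy as np
import pandas as pd

# Made-up rows; column names other than income, kidhome, teenhome are placeholders.
df = pd.DataFrame({
    "income": [20000, 40000, 60000, 80000],
    "kidhome": [1, 0, 2, 0],
    "teenhome": [0, 1, 1, 0],
    "marital_status": ["Single", "Married", "Together", "Divorced"],
    "wine": [10, 200, 50, 600],
    "fruits": [2, 10, 5, 40],
    "meat": [15, 90, 30, 300],
    "fish": [0, 20, 5, 60],
    "sweets": [1, 10, 5, 30],
    "gold": [5, 20, 5, 70],
    "web_purchases": [1, 4, 2, 6],
    "catalog_purchases": [0, 2, 1, 5],
    "store_purchases": [2, 6, 3, 9],
    "web_visits": [8, 5, 7, 2],
})

spend_cols = ["wine", "fruits", "meat", "fish", "sweets", "gold"]
engage_cols = ["web_purchases", "catalog_purchases", "store_purchases", "web_visits"]

# Family size as defined in the list: kidhome + teenhome.
df["family_size"] = df["kidhome"] + df["teenhome"]
# Total amount spent across all product categories.
df["total_spent"] = df[spend_cols].sum(axis=1)
# Engagement score: purchases through each channel plus web visits.
df["engagement_score"] = df[engage_cols].sum(axis=1)
# Income group from quartiles.
df["income_group"] = pd.qcut(df["income"], q=4, labels=["low", "mid", "high", "very high"])
# Assumption: "Married" and "Together" count as partnered, everything else as single.
partnered = df["marital_status"].isin(["Married", "Together"])
df["marital"] = np.where(partnered, "Partnered", "Single")
# Numeric code via np.select; the 0/1 codes and the -1 default are assumptions.
df["k_marital"] = np.select(
    [df["marital"] == "Single", df["marital"] == "Partnered"], [0, 1], default=-1
)
print(df[["family_size", "total_spent", "engagement_score", "income_group", "k_marital"]])
```

Education could be encoded the same way, with one condition per category passed to `np.select`.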
Analyzed key metrics:
Average income: $48,613.
Average total spending: $473.
Average engagement score: 16.5.
I conducted exploratory analysis on the dataset to understand it better and to check for outliers.
I used K-means clustering to segment the dataset, identify high-value customers, and analyze customers by demographics and purchase behavior.
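A K-means segmentation of this kind might look like the sketch below, using scikit-learn. The data is synthetic (two artificial groups described by income, total spending, and engagement score). The choice of two clusters fits only this toy data, since the write-up does not state how many clusters were used. Standardizing the features first is likewise an assumption about preprocessing.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer table.
# Columns: income, total amount spent, engagement score.
rng = np.random.default_rng(0)
low = rng.normal([25000, 100, 10], [3000, 30, 2], size=(50, 3))
high = rng.normal([80000, 1200, 22], [5000, 150, 3], size=(50, 3))
X = np.vstack([low, high])

# Standardize so the large scale of income does not dominate the distances.
X_scaled = StandardScaler().fit_transform(X)

# Assumption: k=2 fits this toy data; the real analysis may use a different k.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X_scaled)

# Treat the cluster with the highest mean spending as the "high-value" segment.
high_value = max(range(2), key=lambda k: X[labels == k, 1].mean())

# Profile each cluster in the original units.
for k in range(2):
    print(k, X[labels == k].mean(axis=0).round(1))
```

On real data, the number of clusters is usually chosen by comparing inertia (the elbow method) or silhouette scores across several values of k.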
Click the pictures above to explore each subsection of the project.