OVERVIEW
XYZ Bank sought to move beyond generic marketing offers by delivering personalized Christmas promotions to its customers. Doing so required grouping customers into distinct segments based on their behavior, so that marketing efforts could be targeted more efficiently. The bank specifically requested no more than 5 customer segments to keep campaign management practical.
OBJECTIVES
The primary objectives of this project were to:
Conduct a comprehensive Exploratory Data Analysis (EDA) on the bank's customer data.
Analyze the distribution of Recency, Frequency, and Monetary (RFM) values to understand customer behavior.
Identify key characteristics and patterns within the customer base.
Determine correlations between various customer attributes.
Apply KMeans Clustering to group customers into a maximum of 5 distinct segments based on their RFM behavior.
Provide actionable insights for developing personalized marketing campaigns.
KEY COLUMNS
The analysis focused on understanding customer behavior through various attributes, including:
customer_id: Unique identifier for each customer.
age: Age of the customer.
new_customer_indicator: Indicates if the customer is new.
seniority: Length of time the customer has been with the bank (tenure).
primary_relationship_status: Customer's primary relationship status with the bank.
customer_type_at_month_start: Type of customer at the beginning of the month.
address_type: Type of customer's address.
customer_activity_index: An index representing customer activity.
income: Customer's income.
recency: Days since the last customer transaction (lower value means more recent activity).
frequency: Total number of transactions or product usages by the customer.
monetary_value: Total amount spent by the customer.
gender: Gender of the customer.
TOOLS
Data Analysis Libraries (Python): Used for data manipulation, statistical analysis, and visualization (e.g., Pandas, NumPy, Matplotlib, Seaborn).
KMeans Clustering (Machine Learning): Applied for customer segmentation.
APPROACH
The project followed a structured approach, combining exploratory data analysis with a clustering methodology:
Phase 1: Problem Description & Data Understanding
Clearly defined the business problem: enhancing marketing campaigns through personalized offers by segmenting customers.
Understood the bank's requirement of no more than 5 customer segments.
Phase 2: Exploratory Data Analysis (EDA)
Distribution Analysis: Examined the distribution of Recency, Frequency, and Monetary (RFM) values using histograms and density plots to understand the spread and skewness of customer behavior.
Box Plots: Used box plots to visualize the distribution of RFM values and identify outliers, which were considered relevant for the segmentation process.
Count Plot of Gender: Analyzed the gender distribution of the customer base to identify demographic insights relevant for marketing.
Correlation Matrix Heatmap: Explored the relationships between various customer attributes (e.g., age, income, RFM values) to understand their interdependencies.
Recency vs. Frequency & Frequency vs. Monetary Value Plots: Visualized the relationships between these key behavioral metrics, often colored by a customer activity index, to identify natural groupings and patterns.
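The numeric side of the EDA steps above can be sketched as follows. This is a minimal illustration, not the project's notebook: the data is synthetic, the 1.5 x IQR rule is the usual box-plot convention for flagging outliers, and the helper name iqr_outlier_mask is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): the bank's real dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 500
customers = pd.DataFrame({
    "age": rng.integers(18, 90, size=n),
    "income": rng.lognormal(mean=10, sigma=0.5, size=n),
    "recency": rng.integers(0, 365, size=n),
    "frequency": rng.poisson(2, size=n) + 1,
})
# Spending is made to depend on frequency so the correlation is visible.
customers["monetary_value"] = customers["frequency"] * rng.lognormal(4, 0.6, size=n)

rfm_cols = ["recency", "frequency", "monetary_value"]

# Skewness per RFM column: positive values indicate a long right tail.
skewness = customers[rfm_cols].skew()

def iqr_outlier_mask(series, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR], as a box plot would."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

# Number of box-plot outliers in each RFM column.
outlier_counts = {col: int(iqr_outlier_mask(customers[col]).sum()) for col in rfm_cols}

# Pearson correlation matrix across all numeric attributes.
corr = customers.corr()

print(skewness)
print(outlier_counts)
print(corr.round(2))
```

The corr table is the kind of input a heatmap would take, for example seaborn's heatmap(corr, annot=True), and the outlier mask can be used to inspect flagged customers rather than to drop them.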
Phase 3: Feature Engineering
Based on the problem description, Recency, Frequency, and Monetary (RFM) values were engineered as features for clustering.
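One common way to derive RFM features is to aggregate a per-transaction table by customer. The sketch below assumes such a table exists; the transactions frame, its transaction_date and amount columns, and the snapshot date are all hypothetical and are not taken from the bank's data.

```python
import pandas as pd

# Hypothetical transaction log (assumption): one row per customer transaction.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "transaction_date": pd.to_datetime([
        "2023-11-01", "2023-12-10", "2023-09-15",
        "2023-12-01", "2023-12-05", "2023-12-20",
    ]),
    "amount": [50.0, 70.0, 20.0, 10.0, 30.0, 60.0],
})

# Reference date used to measure recency (assumption).
snapshot_date = pd.Timestamp("2023-12-31")

rfm = transactions.groupby("customer_id").agg(
    last_transaction=("transaction_date", "max"),  # most recent activity
    frequency=("transaction_date", "count"),       # number of transactions
    monetary_value=("amount", "sum"),              # total amount spent
)

# Recency in days: lower means more recent activity.
rfm["recency"] = (snapshot_date - rfm["last_transaction"]).dt.days
rfm = rfm[["recency", "frequency", "monetary_value"]]

print(rfm)
```

If frequency is instead defined as the number of products used, the count aggregation would be replaced by a count of distinct products per customer.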
Phase 4: Model Recommendation & Clustering Process
Model Recommendation: KMeans Clustering was recommended because it groups customers with similar behavioral patterns across Recency, Frequency, and Monetary value (R, F, M).
Clustering Approach: Applied KMeans Clustering to the engineered RFM features. KMeans minimizes intra-cluster variance (the within-cluster sum of squared distances to each cluster center), so customers within the same group are as similar as possible. The number of clusters was capped at 5, as per the bank's requirement.
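A minimal sketch of this clustering step using scikit-learn is shown below. The write-up does not say how the features were scaled or how the number of clusters was chosen, so the standardization and the silhouette-based choice of k here are assumptions, as are the synthetic data and the MAX_SEGMENTS name.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic RFM data with three loose groups (assumption, for illustration only).
rng = np.random.default_rng(42)
rfm = pd.DataFrame({
    "recency": np.concatenate([rng.normal(300, 20, 100), rng.normal(30, 10, 100), rng.normal(150, 20, 100)]),
    "frequency": np.concatenate([rng.normal(1, 0.3, 100), rng.normal(8, 1.0, 100), rng.normal(4, 0.8, 100)]),
    "monetary_value": np.concatenate([rng.normal(50, 10, 100), rng.normal(900, 100, 100), rng.normal(400, 60, 100)]),
}).clip(lower=0)

MAX_SEGMENTS = 5  # the bank's upper limit on segments

# KMeans is distance-based, so the features are put on a common scale first.
X = StandardScaler().fit_transform(rfm)

# Try every k from 2 up to the cap and score each partition.
scores = {}
for k in range(2, MAX_SEGMENTS + 1):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Keep the k with the highest silhouette score, never exceeding the cap.
best_k = max(scores, key=scores.get)
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
rfm["segment"] = model.labels_

print(scores)
print("chosen k:", best_k, "within-cluster sum of squares:", round(model.inertia_, 2))
```

The inertia_ attribute is the within-cluster sum of squares that KMeans minimizes. An elbow plot of inertia against k is a common alternative to the silhouette score for picking k.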
KEY INSIGHTS
The EDA and clustering process revealed several key insights:
Heterogeneous Customer Base: The distributions of Recency, Frequency, and Monetary values indicated a diverse customer base. A large majority of customers were low-activity, used few products, and spent little, while a small minority were high-activity, used many products, and spent heavily.
Outliers are Relevant: Outliers in the engineered RFM values were treated as relevant to the segmentation process rather than as noise, since they likely reflect distinctive customer behaviors.
Gender Distribution: XYZ Bank's customer base consisted mostly of women, a crucial demographic insight for tailoring marketing campaigns.
Correlations:
Age vs. Seniority: Older customers tended to have higher seniority, indicating a relationship between age and tenure.
Income vs. Monetary Value: Customers with higher incomes tended to have higher monetary values, suggesting a direct relationship between income and spending.
Frequency vs. Monetary Value: Customers who used more products or services tended to have higher monetary values, indicating a strong relationship between product usage and spending.
Behavioral Segments (RFM-based):
Low-Activity, Low-Product Customers: Recently used products but not many different ones, requiring targeted efforts for increased engagement.
High-Activity, High-Product Customers: Long-term users with a variety of products, valuable due to high spending and loyalty.
Low-Usage, Low-Spending Customers: Use few products and spend little, needing marketing to encourage more usage/spending.
High-Usage, High-Spending Customers: Use many products and spend a lot, representing highly valuable and loyal customers.
Moderate-Usage, Moderate-Spending Customers: Customers with balanced product usage and spending, presenting opportunities for up-selling and cross-selling.
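Descriptive labels like the ones above are typically derived by profiling each cluster on its average RFM values. The sketch below shows one way to do that; the segmented frame and its values are made up for illustration and do not reflect the bank's actual segments.

```python
import pandas as pd

# Illustrative clustered output (assumption): RFM features plus a segment label.
segmented = pd.DataFrame({
    "recency": [10, 15, 200, 220, 90, 100],
    "frequency": [8, 9, 1, 2, 4, 5],
    "monetary_value": [900.0, 1100.0, 50.0, 70.0, 400.0, 450.0],
    "segment": [0, 0, 1, 1, 2, 2],
})

# Size and average RFM values per segment, highest spenders first.
profile = (
    segmented.groupby("segment")
    .agg(
        customers=("recency", "size"),
        mean_recency=("recency", "mean"),
        mean_frequency=("frequency", "mean"),
        mean_monetary=("monetary_value", "mean"),
    )
    .sort_values("mean_monetary", ascending=False)
)

print(profile)
```

Reading the table row by row (for example, low recency with high frequency and high spend) is what turns numeric cluster ids into names a marketing team can act on.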
IMPACTS
The project's main impacts for XYZ Bank were:
Enhanced Marketing Campaigns: Enabled the bank to move from generic offers to personalized Christmas promotions, making marketing efforts more relevant to each customer segment.
Improved Customer Engagement: By understanding distinct customer segments, the bank can tailor communication and offers, which is expected to support customer retention and loyalty.
Optimized Resource Allocation: Marketing resources can be more efficiently allocated to specific customer segments, maximizing the return on investment for campaigns.
Data-Driven Decision Making: Provided the bank with a robust, data-driven framework for understanding its customer base and making strategic decisions about product offerings and marketing strategies.
Increased Profitability (Potential): By targeting the right customers with the right offers, the project laid the groundwork for increasing customer spending and overall profitability.
DELIVERABLES
The key deliverables for this project included:
Exploratory Data Analysis (EDA) Report/Notebook: Documenting the distributions, correlations, and initial insights from the data.
Customer Segments: Defined customer clusters (up to 5) based on RFM analysis.
Model Recommendation: Justification for using KMeans Clustering.
Presentation/Summary: A report or presentation summarizing the problem, methodology, key insights, and recommendations for personalized marketing strategies.