Introduction
In this project, we use three predictive models to analyse and estimate Customer Lifetime Value (CLTV). Each model was chosen for its particular strengths in analysing customer behaviour data and generating actionable insights. The Decision Tree Classifier is used because it is easy to interpret and effective at identifying the main drivers of customer churn. It supports informed decisions by revealing the paths through which customer attributes influence churn.
In addition, we use the Gaussian Mixture Model (GMM) and K-Means Clustering to examine customer segmentation in more depth. The GMM is particularly good at identifying latent groups within the customer base, which allows for focused marketing strategies that cater to the specific needs of different customer groups. K-Means Clustering complements this with a simple, hard segmentation of customers, which lets us tailor retention methods and optimise marketing budget allocation based on the clustering results. Together, these models form a framework for predicting CLTV that supports more personalised customer engagement and retention efforts.
Data Transformation
The objective was to find the non-tenured customers who fall into the same cluster as tenured customers. Cluster 3, which contains the highest number of tenured customers, meets this objective: the non-tenured customers in this cluster can be expected to stay with the company for a long time as well. However, the cluster's churn rate shows that these customers still churn, for reasons that need further investigation. Cluster 2 has the lowest churn rate and a reasonably balanced split between tenured and non-tenured customers, which makes its customers unlikely to churn and highly valuable.
Decision Tree Classifier
This model is used to classify customers into those who will churn and those who will not.
Decision Trees are effective for classification because they mimic human decision-making: they split the data into subsets based on feature values, which makes them intuitive and easy to understand. They handle both numerical and categorical data well.
The tree splits each node on the feature that yields the largest entropy reduction (i.e., information gain). We have limited the tree depth to prevent overfitting, a common problem with decision trees.
The model also helps identify which features are most influential in predicting churn. For example, contract type, age, and monthly charges are significant predictors.
The model's performance is assessed using accuracy and the F1 score. The F1 score balances precision and recall, which is especially important for imbalanced datasets such as churn data.
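To make the splitting criterion concrete, the following is a minimal sketch of how information gain can be computed for a candidate split. The labels are a hypothetical toy example (1 = churned, 0 = stayed), and the helper names `entropy` and `information_gain` are illustrative, not taken from the project code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting `parent` into `left` and `right`."""
    weight_left = len(left) / len(parent)
    weight_right = len(right) / len(parent)
    children = weight_left * entropy(left) + weight_right * entropy(right)
    return entropy(parent) - children


# Hypothetical toy labels: 1 = churned, 0 = stayed.
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# Split A separates the two classes perfectly; split B leaves both children mixed.
gain_a = information_gain(parent, [1, 1, 1, 1], [0, 0, 0, 0])
gain_b = information_gain(parent, [1, 1, 0, 0], [1, 1, 0, 0])

print(f"gain A = {gain_a:.3f}, gain B = {gain_b:.3f}")
```

A tree built with an entropy criterion would prefer split A here, because it removes all uncertainty about the class, whereas split B removes none.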
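The workflow described above can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the data is a synthetic, imbalanced stand-in generated with `make_classification`, and `max_depth=4` is an arbitrary illustrative value.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn table (assumption): roughly 25% churners.
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=4,
    weights=[0.75, 0.25], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# criterion="entropy" chooses splits by information gain;
# max_depth caps the depth of the tree to limit overfitting.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0)
tree.fit(X_train, y_train)

y_pred = tree.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)  # F1 of the positive (churn) class

# Importances sum to 1; larger values mean the feature contributed more to the splits.
importances = tree.feature_importances_

print(f"accuracy = {accuracy:.3f}, F1 = {f1:.3f}")
print("feature importances:", importances.round(3))
```

On imbalanced data, accuracy alone can look good even when few churners are found, which is why the F1 score is reported alongside it.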
Results -
Gaussian Mixture Model (GMM) for Clustering
GMMs provide a probabilistic model for representing normally distributed subpopulations within an overall population. The model is used here for its flexibility in accommodating clusters of different shapes and sizes and for its ability to provide soft clustering, in which each customer receives a membership probability for every cluster.
It is employed to understand customer segments by clustering the customer data into groups that exhibit similar behaviours and traits.
We have employed PCA (Principal Component Analysis) for dimensionality reduction before clustering, which helps in reducing the complexity and removing correlations among features.
The Silhouette Score is used to determine the optimal number of clusters. It measures how similar each object is to its own cluster compared with the other clusters; values closer to 1 indicate better-separated clusters.
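A minimal sketch of this pipeline (scaling, PCA, GMM, silhouette-based selection) is shown below. The data is a synthetic stand-in from `make_blobs`, and the choice of two principal components and a candidate range of 2 to 6 clusters are assumptions made for illustration only.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer features (assumption).
X, _ = make_blobs(n_samples=600, n_features=6, centers=3, random_state=0)

# Standardise, then project onto two principal components (illustrative choice).
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=2, random_state=0).fit_transform(X_scaled)

# Fit a GMM for each candidate cluster count and record its silhouette score.
gmm_scores = {}
for k in range(2, 7):
    gmm = GaussianMixture(n_components=k, random_state=0)
    labels = gmm.fit_predict(X_reduced)
    gmm_scores[k] = silhouette_score(X_reduced, labels)

best_k = max(gmm_scores, key=gmm_scores.get)

# Refit with the selected count and read off the soft assignments.
best_gmm = GaussianMixture(n_components=best_k, random_state=0).fit(X_reduced)
memberships = best_gmm.predict_proba(X_reduced)  # one row per customer, rows sum to 1

print("silhouette by k:", {k: round(s, 3) for k, s in gmm_scores.items()})
print("selected k:", best_k)
```

The `memberships` matrix is what distinguishes the GMM from a hard clustering: a customer who sits between two segments shows up with split probabilities rather than a single label.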
Results -
K-Means Clustering
This clustering method is used alongside GMM to refine customer segmentation.
K-Means is simple and effective for partitioning data into a pre-defined number of clusters and is computationally efficient.
It identifies clusters by minimizing the within-cluster sum of squares.
We have evaluated different cluster counts using the silhouette score to find the best segmentation, which informs strategies to address churn in different customer segments.
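The evaluation over cluster counts could look roughly like the sketch below. It uses a synthetic dataset from `make_blobs` as a stand-in for the customer features, and the candidate range of 2 to 7 clusters is an assumption. The `inertia_` attribute is the within-cluster sum of squares that K-Means minimises.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the customer features (assumption).
X, _ = make_blobs(n_samples=500, centers=4, random_state=1)

# For each candidate k, record the within-cluster sum of squares and the silhouette score.
results = {}
for k in range(2, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    results[k] = (km.inertia_, silhouette_score(X, km.labels_))

# Pick the cluster count with the highest silhouette score.
best_k = max(results, key=lambda k: results[k][1])

for k, (wcss, sil) in results.items():
    print(f"k={k}  WCSS={wcss:.1f}  silhouette={sil:.3f}")
print("selected k:", best_k)
```

The within-cluster sum of squares generally keeps falling as k grows, so it cannot select k on its own; the silhouette score provides the comparison across cluster counts.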
Results -
Survival Analysis
Survival analysis is applied to estimate how long a customer will stay with the service before churning.
It is particularly useful for censored data, where the event of interest (churn) has not yet occurred for all subjects (customers) by the end of the observation period.
Kaplan-Meier estimators are used to estimate the survival function from lifetime data, allowing us to compare the churn risk across different clusters or segments.
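The Kaplan-Meier estimate multiplies, at each observed churn time, the fraction of at-risk customers who did not churn at that time. The following is a minimal pure-Python sketch of that calculation. The tenures and churn flags are hypothetical, and the function name `kaplan_meier` is illustrative; in practice a dedicated survival analysis library would normally be used.

```python
def kaplan_meier(durations, churned):
    """Return [(t, S(t))] for each time t at which at least one churn was observed.

    durations: observed tenure of each customer
    churned:   True if churn was observed, False if the customer is censored
               (still active at the end of the observation period)
    """
    event_times = sorted({t for t, e in zip(durations, churned) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Customers whose churn was observed exactly at time t.
        events = sum(1 for d, e in zip(durations, churned) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical tenures in months; False marks a censored (still active) customer.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
churned = [True, True, False, True, True, False, False, False]

curve = kaplan_meier(durations, churned)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
```

Censored customers never count as churn events, but they remain in the at-risk set until their last observed month. Running the same function separately on each cluster's customers gives one curve per cluster for comparison.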
Results -