The main task of this assignment was to identify the optimal minimum dataset for a Bayesian Network and to perform subsequent inferences. The initial 41 variables proved challenging to narrow down. Instead, averaging the variables within each category through feature engineering, while leaving the demographic variables unchanged, proved more beneficial.
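The averaging step described above can be sketched roughly as follows. This is only a minimal illustration, not the assignment's actual code: the category names, item column names and the example respondent are all hypothetical placeholders, and the real dataset may group its 41 variables differently.

```python
from statistics import mean

# Hypothetical mapping from each category to its survey item columns.
# These names are assumptions for illustration, not the real column names.
CATEGORY_ITEMS = {
    "trust": ["trust_1", "trust_2", "trust_3"],
    "convenience": ["conv_1", "conv_2"],
}

# Demographic columns are carried over unchanged (name is an assumption).
DEMOGRAPHIC_COLUMNS = ["type_of_customer"]


def average_categories(row):
    """Collapse item-level responses into one mean score per category."""
    reduced = {col: row[col] for col in DEMOGRAPHIC_COLUMNS}
    for category, items in CATEGORY_ITEMS.items():
        reduced[category] = mean(row[item] for item in items)
    return reduced


# Made-up respondent, used only to show the shape of the input and output.
respondent = {
    "type_of_customer": "regular",
    "trust_1": 4, "trust_2": 5, "trust_3": 3,
    "conv_1": 2, "conv_2": 4,
}
reduced = average_categories(respondent)
```

Applying a function like this to every respondent would reduce the item-level columns to one score per category, which is the smaller variable set the network is then built on.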
A heatmap analysis was conducted to determine the most relevant variables for inclusion. As the heatmap above shows, the "Type of Customer" variable emerged as a significant factor because of its many strong correlations with the averaged variables. An expert-labeled Directed Acyclic Graph (DAG), a learned DAG and a comparative DAG were also plotted above. The expert-labeled DAG was ultimately used to analyze the relationships between purchase intentions and the selected variables.
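To make the idea of inference on a DAG concrete, here is a small sketch of inference by enumeration on a toy three-node network (customer type, trust, purchase intention). The structure is a simplified assumption and every probability in it is invented purely for illustration; none of these numbers come from the assignment's data or its fitted model.

```python
# Toy DAG (an assumption): customer_type -> trust -> intention,
# plus a direct edge customer_type -> intention.
# All probabilities below are made up for illustration only.

P_TYPE = {"regular": 0.5, "need_based": 0.5}

# P(trust | customer_type)
P_TRUST = {
    "regular": {"high": 0.8, "low": 0.2},
    "need_based": {"high": 0.4, "low": 0.6},
}

# P(intention | customer_type, trust)
P_INTENT = {
    ("regular", "high"): {"high": 0.9, "low": 0.1},
    ("regular", "low"): {"high": 0.5, "low": 0.5},
    ("need_based", "high"): {"high": 0.7, "low": 0.3},
    ("need_based", "low"): {"high": 0.3, "low": 0.7},
}

LEVELS = ("high", "low")


def joint(ctype, trust, intent):
    """Joint probability from the chain-rule factorisation of the DAG."""
    return P_TYPE[ctype] * P_TRUST[ctype][trust] * P_INTENT[(ctype, trust)][intent]


def posterior_intention(ctype):
    """P(intention | customer_type), obtained by summing out trust."""
    unnormalised = {
        intent: sum(joint(ctype, trust, intent) for trust in LEVELS)
        for intent in LEVELS
    }
    total = sum(unnormalised.values())
    return {intent: p / total for intent, p in unnormalised.items()}


regular = posterior_intention("regular")
need_based = posterior_intention("need_based")
```

A dedicated Bayesian network library would normally handle this for larger graphs; the hand-written version is shown only because it makes the factorisation implied by the DAG explicit.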
Key findings of this analysis include the following:
Regular customers had higher purchase intention levels than need-based customers
Perceived product quality significantly influenced purchase decisions
Customer trust played a vital role in purchase likelihood
Convenience and physical environment were very important factors in influencing purchase intentions
Price sensitivity, empathy and customer trust were most effective at moderate levels
Please see the following link for the code for this assignment and a more extensive interpretation: