In this project, I explore the relationship between marketing promotional budgets and sales. The dataset includes information about marketing campaigns across TV, radio, and social media, as well as how much revenue in sales was generated from these campaigns. To guide decisions about where to focus future marketing efforts, I used Python to perform exploratory data analysis, to create visualizations to explore relationships between variables and evaluate linear regression assumptions, to build and fit a simple linear regression model, and to summarize the model for evaluation and interpretation.
Key Insights
For each million dollars spent on TV budget, there is about 3.56 increase in sales revenue.
The relationship between TV budget and sales is statistically significant (p<.05) with a high degree of cetainty that the modeled relationship is accurate (95% confidence level that the increase in sales is between 3.558 and 3.565).
TV budget was very strongly related to sales while the relationship was less clear for radio and even less for social media.
I continued to analyze the small business' historical marketing promotion data, including additional variables, to predict sales. I used Python to explore and clean the data, used plots and descriptive statistics to select the independent variables, created a multiple linear regression model, checked model assumptions, and interpreted model outputs to share results with stakeholders.
Key Insights
The model explains about 90% of the variance in sales (adjusted R-squared of .904).
The relationships between TV budget and sales and radio budget and sales are statistically significant (p<.05).
A high TV budget was the best predictor of increased sales.
Not changing the radio budget, if a low TV budget is used rather than a high TV budget, on average sales would be expected to be 154.29 less, with 95% confidence that the sales would be between 163.98 and 144.62 less.
When TV budget is constant, radio budget has a positive relationship (about 3x) with sales.
Extending the analysis, I used the data to run a one-way ANOVA and a post hoc ANOVA test with Python to determine if the sales are significantly different among groups. I used plots and descriptive statistics to select a categorical independent variable, created and fit a linear regression model, checked model assumptions, performed and interpreted a one-way ANOVA test, compared pairs of groups using an ANOVA post hoc test, and interpreted model outputs.
Key Insights
The differences in mean sales between high and low TV budget, high and medium TV budget and low and medium TV budget are all statistically significant (p <.05).
With each increase in TV budget, from low to medium to high, there is a statistically significant increase in sales, with about a 100 million increase in sales for each level of increase in the budget.