The aim of this project was to create an end to end database to dashboard data analysis and visualisation pipeline for a bike store retail company that simultaneously updates to accommodate changes made to the database. This was achieved by using SQL to query a database in SQL Server Management Studio, excel to perform data analysis using pivot tables and create a preliminary interactive dashboard, and Tableau to create the final interactive dashboard accessible online for the convenience of the company.
The database used was a hypothetical bike store sales dataset that can be found in google sheet form via: https://docs.google.com/spreadsheets/d/1ESMiCguVJjUzjVNxLffngDrHsQcMFHrt/edit#gid=1194135803
Before quering the database, a schematic was made to better understand the database and help structure the query.
Because all of the desired information was in separate tables within the database, this query was written to extract the customer name, order date, product sold, product brand, quantity, price and the store it was sold at. To ensure that there were no null values present in the query output, the city and state columns of the customer table were set to not null as these two columns were found to allow null values in their configuration.
Then the database was connected to an excel spreadsheet using the above query.
Pivot tables were created for summary statistics that could be used in the dashboard including revenue for year, month, state, store, brand and category.
Once the Excel spreadsheet was connected to Tableau, a secondary dashboard that can be easily accessed online was created.
The final Tableau dashboard has a few features such as filtering by year using the menu top right, and filtering by state using the drop down menu. The dashboard can be accessed via: https://public.tableau.com/app/profile/genna.dias/viz/98BikesSalesDashboard/Dashboard1
The most noticeable insight is that the revenue for 2018 is far less consistent throughout the year compared to 2016 and 2017. While not immediately visible just looking at total revenue for the years, looking at the monthly revenue shows that 2016 and 2017 saw consistent sales throughout the year, whereas 2018 had a record high spike in revenue in April that then dropped off to incredibly low sales. To ensure that this wasn't due to missing data for the latter of 2018, a query was written to filter for dates greater than April of 2018 and found that data was present for every month of the year following April.
The store with the greatest revenue by far was the New York store Baldwin Bikes which since 2016 has made up 68% of the revenue over the years. The store with the least revenue was the Texas store Rowlett Bikes which only made up 11% of revenue from the start of 2016 to the end of 2018. To ensure this revenue distribution was not due to any of the stores not being open yet, queries filtering by state and date showed that all three stores existed from the start of 2016.
Upon further investigation into 2018, using pivot tables for summarising the stores and the revenue per month, filtering by 2018 shows that all three stores had months where no sales were made. This is incredibly concerning and did not occur in 2016 and 2017. This sudden decrease in revenue experienced by all stores assuming no issues in data collection may suggest that improving infrastructure for facilitating online shopping may be more beneficial for the company compared to investing into more brick and mortar shops.