For the past few months I have been learning about the six stages of the data analysis process (ask, prepare, process, analyze, share and act) and how to use technical tools in each stage to gather insights from large amounts of data. This is a capstone project to highlight the skills learned during the course and to showcase my understanding of each step of the process.
The director of marketing believes annual membership is more profitable than casual members hence the future succes of the company depends on maximizing the number of annual memberships. The questions i asked were how do casual and annual members use Cyclistic bikes differently and how can you convert casual members to annual members.
To answer these questions I will be analyzing historical Cyclistic bike trip data for 12 months from october 2021 to september 2022. The data is primary and collected by Cyclistic and stored on the company's website therefore the data is reliable and free of bias.
Key facts and constraints :
Personal customer information has been removed for privacy issues
Classic bikes were previously labeled docked bikes, they are the same thing.
Classic bikes must start and end at a station, but electric bikes have a lock on them, thus enabling them to start or end a trip in the vicinity of a docking station.
Trips under 1 minute and more than 24 hours should be removed as it is a maintenance trip.
To combine and clean the data I used excel and SQL. Below is an outline of my process :
Replaced 'Docked bikes' with 'Classic bikes'.
Removed trips where start or end station columns were empty only for classic bikes.
Removed trips that had null values in start/end latitude and longitude.
Replaced null values in starting and ending station columns with 'on bike lock' for electric bikes only.
Created ride length, day, month and year columns for easy analysis.
Removed trips where ride length was less than 1 minute or greater than 24 hours.
Created a dataset on Google's Bigquery platform and Imported 12 files for each month on to the dataset.
Combined all the 12 tables into a single table containing all the data from oct 2021 to sept 2022.
Now that the data was clean it was time to analyze it and answer the business questions 'How do annual and casual members use the bike differently' and 'How to convert casual members to annual members.
To analyze the data I used pivot tables to summarise my data by calculating average ride time for casual and annual member and to calculate average ride time of all members on each day of the week. The pivot tables can be viewed below.
I then moved to PowerBI to create more visualizations to summarize my data which can be viewed below.
As you can see from the above tables casual riders had a longer ride time compared to members which shows us that casual riders used the bike for longer leisure rides on weekends while members used it for commuting shorter trips during the weekdays.
From the above PowerBI visualizations you can see saturday and sunday had the highest number of rides. And you can also see the top 5 start stations for the riders.
From the total number of rides per month and the calculation from the pivot tables we can determine that casual riders used the bike for leisure concentrated on the weekends and during the summer months while annual members used the bike for commuting purposes all year on weekdays.
Casual riders average ride times is 21:50 mins, almost twice as that of annual members.
Having found out the difference between casual riders and annual riders, the marketing department can start preparing their campaign to convert the casual riders to annual.
Recommendations :
A new membership could be offered for the weekends where the riders could have unlimited use of the bikes on Saturday and Sunday.
A quarterly membership during the summer where the number of rides were maximum.
The marketing campaign could be run during the summer months for maximum reach.