Cyclistic Bike-share Analysis
In this case study, data from a fictional bike-share company called Cyclistic is analysed to help the company attract more riders. Our director of marketing, who works with the marketing analyst team, believes that the company's success depends on increasing the number of annual memberships. The team therefore wants to understand how casual riders and annual members use Cyclistic bikes differently. Based on the insights gained from the analysis, our team will design new marketing strategies to convert casual riders into annual members.
The analysis is expected to answer three main questions, which will guide these marketing strategies. They are:
How do annual members and casual riders use Cyclistic bikes differently?
Why would casual riders buy Cyclistic annual memberships?
How can Cyclistic use digital media to influence casual riders to become members?
Among these, our marketing director, Lily Moreno, has assigned me the first question.
Twelve months of bike-share data were provided by the company through Amazon Web Services. The data was downloaded from this link and has been made available by Motivate International Inc. under this license.
The downloaded data was not well organised and was not ready for use. Each month's data came as a separate CSV file, and each file was too large to be handled in a spreadsheet. R was therefore used for data manipulation and organisation.
Every change made to the data is listed below:
All twelve monthly datasets were loaded into RStudio and merged into one data frame.
Some columns contained missing values. Because these values could not be recovered and the affected columns were not needed for the analysis, they were omitted from the dataset.
The date was extracted from the 'started_at' column into a new column.
A new column named 'ride_length' was created, containing the time difference between the start and end of each ride.
The values of 'ride_length' were then converted to a numeric data type for calculations.
Another column named 'day_of_week' was created, showing the day of the week of each ride.
Finally, a summary dataset was created showing the number of rides and the average ride duration for each day of the week and each bike type, split by members and casual riders.
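As a rough illustration, the steps above could look like the following base R sketch. It is not the exact script used in the analysis. The two small data frames are toy stand-ins for the monthly CSV files, and the column names 'ended_at', 'rideable_type', 'member_casual' and 'start_station_name' are assumptions made for the example. The missing-value step is shown as dropping an unneeded column, which is only one possible reading of that step.

```r
# Toy stand-ins for two monthly CSV files (the real files are much larger).
# In practice each data frame would come from read.csv() on a downloaded file.
# Column names other than 'started_at' are assumed for illustration.
month_1 <- data.frame(
  rideable_type      = c("classic_bike", "classic_bike"),
  started_at         = c("2021-06-05 10:00:00", "2021-06-05 14:00:00"),
  ended_at           = c("2021-06-05 10:30:00", "2021-06-05 14:10:00"),
  start_station_name = c("Station A", NA),
  member_casual      = c("casual", "casual"),
  stringsAsFactors   = FALSE
)
month_2 <- data.frame(
  rideable_type      = c("electric_bike", "classic_bike"),
  started_at         = c("2021-07-05 08:00:00", "2021-07-06 09:00:00"),
  ended_at           = c("2021-07-05 08:15:00", "2021-07-06 09:05:00"),
  start_station_name = c(NA, "Station B"),
  member_casual      = c("member", "member"),
  stringsAsFactors   = FALSE
)

# Merge the monthly data frames into one.
all_trips <- rbind(month_1, month_2)

# Drop a column that contains missing values and is not needed here.
all_trips$start_station_name <- NULL

# Parse the timestamps, then split the date out of 'started_at'.
time_format <- "%Y-%m-%d %H:%M:%S"
started <- as.POSIXct(all_trips$started_at, format = time_format, tz = "UTC")
ended   <- as.POSIXct(all_trips$ended_at,   format = time_format, tz = "UTC")
all_trips$date <- as.Date(started, tz = "UTC")

# Ride length in seconds, converted from difftime to plain numeric.
all_trips$ride_length <- as.numeric(difftime(ended, started, units = "secs"))

# Day of the week for each ride (day names depend on the session locale).
all_trips$day_of_week <- weekdays(all_trips$date)

# Number of rides and average ride length per rider type, bike type and day.
groups <- ride_length ~ member_casual + rideable_type + day_of_week
ride_counts <- aggregate(groups, data = all_trips, FUN = length)
ride_means  <- aggregate(groups, data = all_trips, FUN = mean)
names(ride_counts)[names(ride_counts) == "ride_length"] <- "number_of_rides"
names(ride_means)[names(ride_means) == "ride_length"]   <- "average_ride_length"
ride_summary <- merge(ride_counts, ride_means)

print(ride_summary)
```

With the full twelve months of data, the same grouping would yield one summary row per combination of rider type, bike type and day of the week that actually occurs in the data.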
[Feel free to head over to the analysis section for insights and data visualisations.]