Public Domain image, https://free-images.com/display/bikeone_bicycles_in_krakow.html
INTRODUCTION
This is a Cyclistic bike-share analysis case study, based on the capstone project of the Google Data Analytics Certificate. It analyzes historical data for Cyclistic, a fictional bike-sharing company. In this study, I assume the role of a junior data analyst.
SCENARIO
The director of marketing at Cyclistic, a bike-share company in Chicago, believes the company’s future success depends on maximizing the number of annual memberships. Therefore, the analyst team wants to understand how casual riders and annual members use Cyclistic bikes differently. From these insights, my team will design a new marketing strategy to convert casual riders into annual members.
ABOUT THE COMPANY
Cyclistic is a bike-sharing company located in Chicago. In 2016, Cyclistic launched a successful bike-share offering. Since then, the program has grown to a fleet of 5,824 bicycles that are geotracked and locked into a network of 692 stations across Chicago. The bikes can be unlocked from one station and returned to any other station in the system at any time.
Cyclistic’s finance analysts have concluded that annual members are much more profitable than casual riders. Although the pricing flexibility helps Cyclistic attract more customers, marketing believes that maximizing the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, marketing believes there is a very good chance to convert casual riders into members. This is because casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs.
The primary goal of the marketing team is to design marketing strategies aimed at converting casual riders into annual members.
In order to do that, however, the marketing analyst team needs to better understand how annual members and casual riders differ, why casual riders would buy a membership, and how digital media could affect their marketing tactics.
BUSINESS TASK
The business task set for this analysis is to answer the question: How do annual members and casual riders use Cyclistic bikes differently?
The report below contains the following deliverables:
A clear statement of the business task
A description of all data sources used
Documentation of cleaning or manipulation of data
A summary of analysis
Supporting visualizations and key findings
Top three recommendations based on analysis
STAKEHOLDERS
The stakeholders for this project include:
Lily Moreno, the director of marketing. She is responsible for the development of campaigns and initiatives to promote the bike-share program.
Cyclistic marketing analytics team. A team of data analysts who are responsible for collecting, analyzing, and reporting data that helps guide Cyclistic marketing strategy.
Cyclistic executive team, which makes the final decision on the recommended marketing program.
DATA LOCATION AND ORGANIZATION
The data used for this analysis is Cyclistic’s historical trip data, made available by Motivate International Inc. under this license. The data is organized as a set of monthly CSV files. The twelve files covering January 2021 to December 2021 were used for this project.
Each file has 13 columns containing information such as the ride_id, bike (rideable) type, user type, start station, and end station.
TOOLS EMPLOYED
The tools used for this analysis include:
Microsoft Excel: used to verify the structure and integrity of each file.
RStudio Desktop: used to aggregate, clean, and analyze the data.
Tableau: used to create the visualizations.
DATA CLEANING AND MANIPULATION
To prepare the data for analysis, the following cleaning and manipulation steps were carried out.
(a) Each file was opened in Excel and reviewed to verify the number of fields, column names, and data formats, and to ensure data integrity.
The review revealed that
(i) the column names are consistent across the 12 files,
(ii) no duplicate records were found,
(iii) there were records with missing start and end stations, and
(iv) there were records for trips starting or ending at an administrative station.
(b) The twelve files were imported into RStudio and aggregated into one single data frame. The resulting aggregated file consisted of 5,595,063 rows and 13 columns.
(c) A new column, ride_length, was created to hold the length of each trip. It was calculated by subtracting the “started_at” column from the “ended_at” column.
(d) In addition, other columns were created from the “started_at” column. These new columns are day, month, year, time, and day_of_week.
(e) Records with missing “start_station_name”, “start_station_id”, “end_station_name”, and “end_station_id” were removed. A total of 1,006,761 (18% of total) records were removed.
(f) Records for trips starting or ending at an administrative station and records where the ride_length was negative were also removed. This affected 116 rows.
(g) The cleaned data set contains 4,588,186 of the initial 5,595,063 records.
The detailed cleaning steps and the R code used can be found here.
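Steps (c) to (f) can be sketched in code. The analysis itself was done in R, so the following is only a minimal Python sketch using the standard library, not the code used for this project. It assumes each trip is a dict keyed by column name, that timestamps look like 2021-07-03 10:00:00, and that a record is dropped when any of the four station fields is empty. The function name clean_trips and the admin_stations parameter are hypothetical; the administrative station names are not listed in this report.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout
STATION_FIELDS = (
    "start_station_name", "start_station_id",
    "end_station_name", "end_station_id",
)
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")  # datetime.weekday(): Monday is 0


def clean_trips(rows, admin_stations=frozenset()):
    """Return cleaned copies of trip rows (dicts keyed by column name).

    admin_stations is a hypothetical set of administrative station names.
    """
    cleaned = []
    for row in rows:
        # Step (e): drop records with any missing station field (assumption:
        # "missing" means empty or absent).
        if any(not (row.get(field) or "").strip() for field in STATION_FIELDS):
            continue
        # Step (f), first part: drop trips touching an administrative station.
        if (row["start_station_name"] in admin_stations
                or row["end_station_name"] in admin_stations):
            continue
        started = datetime.strptime(row["started_at"], TIMESTAMP_FORMAT)
        ended = datetime.strptime(row["ended_at"], TIMESTAMP_FORMAT)
        # Step (c): ride_length = ended_at - started_at, in seconds.
        ride_length = (ended - started).total_seconds()
        # Step (f), second part: drop negative ride lengths.
        if ride_length < 0:
            continue
        # Step (d): columns derived from started_at.
        cleaned.append({
            **row,
            "ride_length": ride_length,
            "day": started.day,
            "month": started.month,
            "year": started.year,
            "time": started.strftime("%H:%M:%S"),
            "day_of_week": DAY_NAMES[started.weekday()],
        })
    return cleaned
```

The rows could come from csv.DictReader applied to each monthly file in turn, which would also cover the aggregation in step (b).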
ANALYSIS
After the data had been cleaned, a descriptive analysis was carried out in RStudio to determine:
Mean, median, minimum, and maximum ride_length.
Average ride_length for members and casual riders.
Average ride_length for users by day_of_week
Number of rides for users by day_of_week
The detailed R code used can be found here.
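The four summaries listed above amount to one overall summary plus grouped averages and counts. As an illustration only (the project used R), here is a minimal Python sketch with the standard library. It assumes each trip is a dict with a numeric ride_length in seconds and a day_of_week field; the rider-type column name member_casual and both function names are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_ride_length(trips):
    """Overall mean, median, minimum, and maximum ride_length (seconds)."""
    lengths = [trip["ride_length"] for trip in trips]
    return {
        "mean": mean(lengths),
        "median": median(lengths),
        "min": min(lengths),
        "max": max(lengths),
    }


def ride_stats_by(trips, *keys):
    """Ride count and average ride_length for each combination of keys.

    Groups are keyed by a tuple of the values found in the given columns,
    e.g. ("casual", "Saturday") for keys ("member_casual", "day_of_week").
    """
    groups = defaultdict(list)
    for trip in trips:
        groups[tuple(trip[key] for key in keys)].append(trip["ride_length"])
    return {
        group: {"rides": len(lengths), "avg_ride_length": mean(lengths)}
        for group, lengths in groups.items()
    }
```

Calling ride_stats_by(trips, "member_casual") gives the average per rider type, and ride_stats_by(trips, "member_casual", "day_of_week") gives both the average ride_length and the number of rides per rider type and day.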
SHARE
Tableau was used to analyze the data further and to create the visualizations. It was used to determine:
Average ride_length for members and casuals.
Average ride_length for users by day_of_week
Number of rides for users by day_of_week
Number of rides for users by month
Top 20 start stations by user type
Top 20 end stations by user type.
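The station rankings were produced in Tableau. For readers who want to check them in code, the following is a minimal Python sketch of the same idea: count rides per station within each rider type and keep the most frequent ones. The function name top_stations and the rider-type column name member_casual are assumptions made for this example.

```python
from collections import Counter, defaultdict


def top_stations(trips, station_field="start_station_name", n=20):
    """Return the n most used stations for each rider type.

    The result maps rider type to a list of (station, ride count) pairs,
    ordered from most to least used.
    """
    counts = defaultdict(Counter)
    for trip in trips:
        counts[trip["member_casual"]][trip[station_field]] += 1
    return {rider: counter.most_common(n) for rider, counter in counts.items()}
```

Passing station_field="end_station_name" would give the end-station ranking instead.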
SUMMARY OF ANALYSIS
From the visual analysis, we can see that there are several key differences in how annual members and casual riders use Cyclistic bikes.
Figure 1 shows that casual riders take longer trips than members. The average ride lasts 1,951 seconds (32.52 minutes) for casual riders, compared with 791 seconds (13.18 minutes) for members.
Figure 1: Average Ride Length for Members and Casuals
In Figure 2, weekends (Saturdays and Sundays) are more popular with casual riders, whereas members prefer weekdays. Sunday made up about 12% of average trips.
Figure 2: Average Ride Length for Users by Day of Week
Members take more trips than casual riders (Figure 3). The total ride count was higher for members (2,539,851) than for casual riders (2,048,335). More trips were taken on Saturday than on any other day. Saturday trips made up 10.21% of total rides.
Figure 3: Number of Rides for Users by Day of Week
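As a quick consistency check on the ride counts, the two totals can be added and turned into percentage shares. This is a small Python sketch rather than part of the original workflow; the helper name share is made up for this example, and it assumes the denominator is the total number of rides in the cleaned data set.

```python
# Ride counts reported for Figure 3.
member_rides = 2_539_851
casual_rides = 2_048_335
total_rides = member_rides + casual_rides  # should match the cleaned record count


def share(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


member_share = share(member_rides, total_rides)
casual_share = share(casual_rides, total_rides)
print(f"members: {member_share}%  casual riders: {casual_share}%")
```

The sum equals the 4,588,186 records left after cleaning, and members account for roughly 55% of all rides.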
Figure 4 shows that the winter months (December, January, and February) see very few rides for both types of riders. The summer months (June to September) are popular with both types of riders. July is the busiest month for casual riders. July made up 8.05% of total rides.
Figure 4: Number of Rides for Users by Month
In Figure 5 below, there is a notable difference in the bike preferences of members and casual riders. Both rider types prefer classic bikes, but members use this type of bike more than casual riders do. Docked bikes, however, are more popular with casual riders.
Figure 5: Bike Type Usage by Riders
Figure 6: Top 20 Start Stations by User Type
Figure 7: Top 20 End Stations by User Type
OBSERVATION AND RECOMMENDATIONS
It can be observed from the analysis above that there is a clear difference in the usage of Cyclistic bikes by the casual riders and member riders. Therefore, to increase profitability by converting casual riders to annual members through a targeted marketing campaign, the following are recommended to the Cyclistic marketing team.
The marketing campaign should be targeted at the popular start and end stations for casual riders.
To reach the most casual riders, the digital marketing campaign should be targeted at the busiest days (Fridays, Saturdays, and Sundays) and the most popular months (June to September). The marketing team should allocate more of its budget to the summer months.
The digital marketing campaign should target the geolocations of casual riders, as well as review pages and social media groups for the bike types (classic bikes and docked bikes) that are popular with casual riders.
Cyclistic should consider building more docking stations around the popular start and end stations of the casual riders. In addition, the subscription model should be reviewed and made more appealing to casual riders.