Google Data Analytics Capstone Project: Cyclistic Bike-Share Case Study
--
Introduction
In December 2021, I enrolled in the Google Data Analytics Professional Certificate offered on Coursera to sharpen my data analytics skills. The course introduced me to tools I had never used before and deepened my skills with the tools I already worked with. I learned how to code in R, create compelling visualizations in R and Tableau, use SQL for data wrangling and cleaning, ask effective questions, gain valuable insights and make recommendations.
Google recommends that everyone taking the course create their own Capstone Project to highlight the skills they have gained and to build a portfolio that potential employers can easily view. I chose the Cyclistic Bike-Share Case Study for my Capstone Project.
Background
The Cyclistic bike-share case study requires me to perform the tasks of a real-world junior data analyst in the marketing department of a fictional company, Cyclistic. As the company's junior data analyst, I will meet different characters and team members.
The director of marketing believes the company’s future success depends on maximizing the number of annual memberships. Therefore, my team wants to understand how casual riders and annual members use Cyclistic bikes differently.
From these insights, my team will design a new marketing strategy to convert casual riders into annual members. But first, Cyclistic executives must approve our recommendations, so they must be backed up with compelling data insights and professional data visualizations.
Business Task
Analyze how annual members and casual riders differ in their use of Cyclistic bikes in order to create a marketing campaign to convert casual riders into annual members.
Data Sources
The data was made available online at https://divvy-tripdata.s3.amazonaws.com/index.html by Motivate International Inc under this license. Motivate operates the City of Chicago’s Divvy bicycle sharing service.
The data is stored in CSV format. It is structured data, organized in rows and columns. It comes from a first-party source, so it can reasonably be considered credible and unbiased.
Data Cleaning and Manipulation
I chose R for the data cleaning and manipulation because it handles the large number of rows (over 5 million) and allows quick visualization.
I downloaded the Cyclistic bike trip data for the previous 12 months, that is, January 2021 to December 2021. I unzipped each of the 12 files, then compressed the 12 individual CSV files into a single zip file for ease of upload into RStudio. On upload, RStudio automatically uncompressed the zip file into the individual CSV files for each month of 2021.
Each CSV file has 13 columns of data. Once combined, the data has 13 columns and 5,595,063 rows.
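The combining step can be sketched in a few lines. The analysis itself was done in R; the Python below only illustrates the logic using the standard library. The folder name `tripdata_2021` and the function name `combine_monthly_files` are hypothetical, not taken from the project.

```python
import csv
from pathlib import Path


def combine_monthly_files(paths):
    """Read each monthly CSV and stack the rows into one list of dicts."""
    combined = []
    expected_columns = None
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)
            # Every monthly file should share the same header before stacking.
            if expected_columns is None:
                expected_columns = reader.fieldnames
            elif reader.fieldnames != expected_columns:
                raise ValueError(f"Column mismatch in {path}")
            combined.extend(reader)
    return combined


if __name__ == "__main__":
    # Hypothetical folder holding the 12 unzipped monthly files.
    monthly_paths = sorted(Path("tripdata_2021").glob("*.csv"))
    trips = combine_monthly_files(monthly_paths)
    print(len(trips), "rows combined")
```

Checking the header of each file before stacking guards against a month whose columns were renamed or reordered, which would otherwise silently misalign the combined data.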
The combined data frame includes a few hundred entries where bikes were taken out for maintenance or where the trip duration was negative. These anomalies were removed from the dataset. Six new columns were also added: date, month, day, year, day_of_week and ride_length. The new combined data frame now has 5,594,410 rows and 19 columns.
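A minimal sketch of this cleaning step follows, written in Python for illustration even though the project used R. Several details are assumptions, not facts from the project: the column names `started_at`, `ended_at` and `start_station_name`, the timestamp layout, the use of seconds for `ride_length`, and the `maintenance_station` marker used to spot maintenance entries.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def clean_and_enrich(rows, maintenance_station="HQ QR"):
    """Drop anomalous trips and add the six derived columns.

    `maintenance_station` is a hypothetical marker for maintenance entries.
    """
    cleaned = []
    for row in rows:
        start = datetime.strptime(row["started_at"], TIMESTAMP_FORMAT)
        end = datetime.strptime(row["ended_at"], TIMESTAMP_FORMAT)
        ride_length = (end - start).total_seconds()
        # Remove trips with a negative duration.
        if ride_length < 0:
            continue
        # Remove entries flagged as maintenance (assumed marker).
        if row.get("start_station_name") == maintenance_station:
            continue
        enriched = dict(row)
        enriched.update(
            date=start.date().isoformat(),
            month=f"{start.month:02d}",
            day=f"{start.day:02d}",
            year=str(start.year),
            day_of_week=DAY_NAMES[start.weekday()],
            ride_length=ride_length,  # seconds
        )
        cleaned.append(enriched)
    return cleaned
```

Copying each row before adding the new fields leaves the raw data untouched, so the number of removed rows can be checked by comparing the lengths of the input and output lists.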
As the chart below shows, casual riders are more active on weekends (Saturday and Sunday), while members are more active midweek (Tuesday, Wednesday and Thursday).
The chart below shows that casual riders have significantly longer trip durations than members.
The weekly maximum trip duration further supports the point that casual riders are most active on weekends.
As can be seen below, casual riders tend to spend more time on rides in the first six days of the month.
The number of rides completed each day has no noticeable pattern.
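Comparisons like the ones above come from grouping trips by rider type and weekday. The sketch below shows one way to compute such a summary in Python; the project itself used R. It assumes each row carries `member_casual`, `day_of_week` and a numeric `ride_length`, and the function name `summarize_by_weekday` is hypothetical.

```python
from collections import defaultdict


def summarize_by_weekday(rows):
    """Count rides and summarize ride_length per (rider type, weekday)."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["member_casual"], row["day_of_week"])
        groups[key].append(row["ride_length"])
    return {
        key: {
            "rides": len(lengths),
            "mean_ride_length": sum(lengths) / len(lengths),
            "max_ride_length": max(lengths),
        }
        for key, lengths in groups.items()
    }


if __name__ == "__main__":
    # Made-up rows for illustration only, not results from the dataset.
    demo = [
        {"member_casual": "casual", "day_of_week": "Saturday", "ride_length": 1800},
        {"member_casual": "member", "day_of_week": "Tuesday", "ride_length": 600},
    ]
    for key, stats in summarize_by_weekday(demo).items():
        print(key, stats)
```

The same grouping works for the day of the month by swapping `day_of_week` for `day` in the key.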
Recommendations
- To convert casual riders into members, Cyclistic should offer discounted memberships to casual riders with long trip durations.
- Cyclistic should also run promotional offers in the first week of each month that reward casual riders with prizes for the longest trip duration each day. These prizes should be claimable only by members.
- Cyclistic should target weekends for granting free and discounted membership offers.