Unveiling Cyclistic Bike-Share Insights Through Data Analysis

Aziz Ba Bakr

Useful Links:
(Note: The below blog is a summary of the SQL analysis conducted. For the complete code and detailed insights, please refer to my GitHub Repository in the Useful Links section above)
(Note: The below blog is a summary of the Power BI dashboard created. For the complete visual experience, please refer to the links provided above in the Useful Links section)
Introduction:
Cyclistic, a prominent bike-share program in Chicago, has evolved into a thriving network of over 5,800 bicycles across 692 stations. The company is eager to enhance its marketing strategy and maximize its annual membership base. This article delves into the data-driven journey of Cyclistic, exploring SQL queries and analytics to uncover valuable insights that can influence decision-making.
Combining and Cleaning Data:
The analysis began with consolidating Cyclistic’s ride data from multiple months in 2022 into a single table named “combined_data.” This step eliminated data fragmentation and enabled comprehensive analysis.
-- Appending all year table into one
Create Table combined_data
(
[ride_id] varchar(50) Primary Key,
[rideable_type] varchar(50),
[started_at] datetime,
[ended_at] datetime,
[start_station_name] varchar(100),
[start_station_id] varchar(50),
[end_station_name] varchar(100),
[end_station_id] varchar(50),
[start_lat] float,
[start_lng] float,
[end_lat] float,
[end_lng] float,
[member_casual] varchar(50)
)

Insert Into combined_data
(
[ride_id],
[rideable_type],
[started_at],
[ended_at],
[start_station_name],
[start_station_id],
[end_station_name],
[end_station_id],
[start_lat],
[start_lng],
[end_lat],
[end_lng],
[member_casual]
)

Select * from Cyclistic.dbo.[2022-01-tripdata]
Union All
Data Integrity and Normalization:
Ensuring data integrity was vital. Negative ride durations were rectified, and duplicate records were addressed. Tables for ride types, stations, user categories, and calendar dates were created for comprehensive analysis.
-- Getting rid of unwanted columns - Data Cleaning
Create Table TripData
(
ride_id varchar(50) Primary Key,
rideable_type varchar(50),
started_at datetime,
ended_at datetime,
start_station_name varchar(100),
start_station_id varchar(50),
end_station_name varchar(100),
end_station_id varchar(50),
user_cat varchar(50)
)

Insert Into TripData
(
ride_id,
rideable_type,
started_at,
ended_at,
start_station_name,
start_station_id,
end_station_name,
end_station_id,
user_cat
)

Select
ride_id,
rideable_type,
started_at,
ended_at,
start_station_name,
start_station_id,
end_station_name,
end_station_id,
user_cat = member_casual
From
combined_data
Joining Tables for Enhanced Insights:
Merging the normalized tables provided a holistic view of ride data, encompassing factors like rideable type, day of the week, ride length, and user category. This enriched dataset facilitated deeper explorations.
-- Joining tables all together
Select
a.ride_id,
b.rideable_type_id,
a.rideable_type,
h.day_of_week_name,
h.day_of_week_number,
h.day__of_month_number,
h.weekend,
h.month_number,
h.year_number,
c.start_date,
c.start_time,
c.end_date,
c.end_time,
c.start_station_id,
c.start_station_name,
c.end_station_id,
c.end_station_name,
j.user_cat as user_category

Into joined_tables
From master_table a
Left Join rideable_type b
On a.rideable_type = b.distinct_rideable_type

Left Join ride_data c
On a.ride_id = c.ride_id

Left Join calendar h
On c.start_date = h.date_value

Left Join user_cat_data j
On a.user_cat = j.user_cat
Exploratory Data Analysis (EDA):
The EDA phase uncovered significant patterns. Average ride lengths, the distribution of ride durations, and preferences of member vs. casual riders were studied across months, days of the week, and hours.
-- Aggregated Summary for exportation & Visualization
Select
a.user_category,
a.rideable_type,
DATEFROMPARTS(2022, a.month_number, a.day__of_month_number) as date,
DATEPART(QUARTER, a.start_date) as quarter,
DATENAME(MONTH, a.start_date) as month,
a.day__of_month_number,
a.day_of_week_number,
a.day_of_week_name,
DATEPART(hour,(CAST(a.start_time as time))) as hour,
SUM(a.min_diff) as total_lengths,
avg(a.min_diff) AS average_ride_length,
count(a.ride_id) as ride_id_count,
COUNT(case when a.min_diff <= 10 then 'short ride' end) as short_rides,
count(case when a.min_diff > 10 and a.min_diff <= 15 then 'average ride' end) as average_rides,
COUNT(case when a.min_diff > 15 then 'long ride' end) as long_rides

From
joined_tables a
Group by
a.user_category,
a.rideable_type,
a.month_number,
DATEPART(QUARTER, a.start_date),
DATENAME(MONTH, a.start_date),
a.day__of_month_number,
a.day_of_week_number,
a.day_of_week_name,
DATEPART(hour,(CAST(a.start_time as time)))

Order by
a.month_number,
DATEPART(

QUARTER, a.start_date),
a.day__of_month_number,
a.day_of_week_number,
DATEPART(hour,(CAST(a.start_time as time)))
Bike Usage Trends:
Key insights emerged, including bike preferences, average ride lengths, and trends in user categories. Data suggested that users preferred shorter rides, with casual riders contributing to a significant portion of the user base.
Monthly Insights:
Monthly analyses revealed seasonal trends, indicating increased bike usage during certain periods. These insights could inform marketing campaigns and station management strategies.
Day-of-Week Trends:
Day-of-week analysis exposed fluctuations in usage based on weekdays and weekends. It became evident that members and casual riders exhibit different usage patterns.
Hourly Usage Patterns:
Analyzing hourly usage patterns highlighted peak times when users accessed bikes. Insights into hourly trends could be leveraged for efficient bike redistribution and maintenance.
Segmented Ride Analysis:
Rides were categorized as short, average, or long based on their duration. This categorization revealed member and casual rider preferences for ride length.
Aggregated Summary:
The article concluded with a comprehensive summary of insights. Aggregated data on ride lengths, user categories, and preferred ride times were presented. This summarized view offered a holistic snapshot of Cyclistic’s bike-share program performance.
Conclusion:
By delving into Cyclistic’s historical ride data, this analysis unveiled valuable insights that can steer the company’s marketing strategies, optimize station management, and enhance user experiences. These insights serve as a testament to the power of data-driven decision-making in transforming a bike-sharing program’s success.
Power BI Visualization Report:
User Behavior Insights: The Power BI report’s analysis of ride lengths over various time intervals — months, quarters, days of the week, and hours — has revealed significant engagement trends. This knowledge can be leveraged to maximize user interactions by targeting peak usage periods and tailoring promotions to these moments of heightened activity.
User Category Comparison: The contrasting ride lengths and frequencies between member and casual users have unveiled distinctive preferences among these segments. Armed with this understanding, Cyclistic can strategize effectively to convert casual riders into annual members, thereby tapping into the superior profitability associated with the latter.
Rideable Type Preferences: My examination of rideable type data has spotlighted the favored bike types among users. This granular understanding empowers Cyclistic to streamline bike inventory distribution across stations, ensuring that popular stations are well-stocked with the preferred bike models during periods of high demand.
Strategic Insights: With my insights focused on converting casual riders into annual members, Cyclistic can optimize its marketing campaigns to align with this overarching goal. By directing resources towards transforming existing users into members, I can capitalize on their pre-existing awareness of Cyclistic’s program.
Data-Driven Decision-Making: My analysis’s strength lies in its data-driven nature. The Power BI report provides the tools to explore data across various dimensions, allowing stakeholders to tailor analyses and gain deeper insights into user behavior, preferences, and engagement patterns.
Conclusion: In conclusion, this Power BI report isn’t just a collection of charts and figures — it’s a gateway to transformation. Armed with these insights, Cyclistic can pave the way for increased ridership, enhanced user experiences, and operational efficiency. The data’s tale is told in the visuals, and the actions it beckons are bound to shape the future of Cyclistic in profound ways. By embracing data-driven decision-making, Cyclistic can truly pedal toward a future of growth and success.
Note:
The SQL queries and analyses mentioned in this article are based on the provided data and have been tailored for explanatory purposes. Actual results may vary based on the data’s quality and completeness. The article aims to showcase the methodology and approach to data analysis.
Like this project
0

Posted Jan 28, 2025

This article delves into the data-driven journey of Cyclistic, exploring SQL queries and analytics to uncover valuable insights that can help decision-making

Exploring Customer Data for Profit Maximization
Exploring Customer Data for Profit Maximization
Analyzing Ben’s Pizzeria: A Data Analysis Case Study
Analyzing Ben’s Pizzeria: A Data Analysis Case Study
Exploring AdventureWorks Dataset
Exploring AdventureWorks Dataset