Healthcare Data Analysis

Marcela Fornazari

Efficient patient flow management is crucial for hospitals. With limited beds, hospitals must balance quality care, financial sustainability, and equitable treatment for all patients. Understanding patient stays, medical procedures, demographic disparities, and success stories is key to achieving these goals.
In this analysis, I use SQL on this dataset to answer some questions about how a hospital can run its operations effectively.
1. Length of Stay Distribution
Beds are a finite resource, and once they're full, the hospital can't admit more patients, which affects both care quality and revenue. To address this, the hospital has two options: expand its facilities by building more rooms and beds, which is costly and time-consuming, or discharge some patients to free up space, which is simpler and equally effective.
To understand patient lengths of stay, I built a histogram in SQL to visualize the distribution of hospital stays. I wanted to know whether most patients stay fewer or more than 7 days. Remarkably, the majority of patients spend fewer than 7 days in the hospital, which suggests efficient turnover.
SQL Function Used: Histogram creation with COUNT and GROUP BY:
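A minimal sketch of this kind of histogram query, run through Python's built-in sqlite3 module so it is self-contained. The table name `health`, the column `time_in_hospital`, and the sample rows are assumptions for illustration, not the actual dataset.

```python
import sqlite3

# Hypothetical schema and made-up rows; the real dataset may be structured differently.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE health (patient_nbr INTEGER, time_in_hospital INTEGER)")
stays = [1, 2, 2, 3, 3, 3, 4, 5, 8, 10]  # lengths of stay, in days
conn.executemany(
    "INSERT INTO health VALUES (?, ?)",
    [(i, days) for i, days in enumerate(stays, start=1)],
)

# One row per length of stay, with the number of patients who stayed that long.
HISTOGRAM_SQL = """
SELECT time_in_hospital AS days,
       COUNT(*)         AS patients
FROM health
GROUP BY time_in_hospital
ORDER BY time_in_hospital
"""

histogram = conn.execute(HISTOGRAM_SQL).fetchall()

# Draw a simple text histogram: one asterisk per patient.
for days, patients in histogram:
    print(f"{days:>2} | {'*' * patients}")
```

Each bar is one asterisk per patient, so the shape of the distribution is visible straight from the query output.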
2. Most Performed Medical Specialties
Medical procedures are one of a hospital's biggest costs. With this query, I identified which specialties perform the most procedures per patient on average. To keep the results meaningful, I only kept specialties with more than 50 patients and an average of more than 2.5 procedures. Based on the results of my query, cardiology is the specialty with the most procedures, and so the one where the hospital is spending the most.
SQL Function Used: Filtering with HAVING:
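A sketch of how the HAVING filter might look, again via sqlite3. The table `health`, the columns `medical_specialty` and `num_procedures`, and the rows are assumptions. The toy data is tiny, so the example passes a small patient threshold as a parameter; on the real data it would be 50.

```python
import sqlite3

# Hypothetical schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE health (medical_specialty TEXT, num_procedures INTEGER)")
rows = [
    ("Cardiology", 4), ("Cardiology", 3), ("Cardiology", 5),
    ("Surgery", 3), ("Surgery", 3), ("Surgery", 3),
    ("Pediatrics", 1), ("Pediatrics", 0), ("Pediatrics", 2),
    ("Neurology", 6),  # a single patient, dropped by the patient-count filter
]
conn.executemany("INSERT INTO health VALUES (?, ?)", rows)

# HAVING filters on aggregates, after GROUP BY has formed the groups.
SPECIALTY_SQL = """
SELECT medical_specialty,
       ROUND(AVG(num_procedures), 1) AS avg_procedures,
       COUNT(*)                      AS patients
FROM health
GROUP BY medical_specialty
HAVING COUNT(*) > :min_patients
   AND AVG(num_procedures) > :min_avg
ORDER BY avg_procedures DESC
"""

top_specialties = conn.execute(
    SPECIALTY_SQL, {"min_patients": 2, "min_avg": 2.5}
).fetchall()
print(top_specialties)
```

WHERE could not do this job, because it runs before the rows are grouped and cannot see COUNT or AVG.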
3. Analysis of Demographic Disparities
I investigated whether the hospital treats patients of different races differently, specifically regarding the number of lab procedures. For this analysis, I needed to combine the healthcare table with another one containing demographic data of the population treated in the hospital. Upon reviewing the results, I didn't observe any significant disparities between different racial groups.
SQL Function Used: Joining tables with JOIN:
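A sketch of the join, assuming a `demographics` table and a `health` table that share a `patient_nbr` key. The table names, column names, race labels, and rows are all hypothetical.

```python
import sqlite3

# Hypothetical schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demographics (patient_nbr INTEGER, race TEXT)")
conn.execute("CREATE TABLE health (patient_nbr INTEGER, num_lab_procedures INTEGER)")
conn.executemany(
    "INSERT INTO demographics VALUES (?, ?)",
    [(1, "Caucasian"), (2, "Caucasian"), (3, "AfricanAmerican"),
     (4, "AfricanAmerican"), (5, "Hispanic")],
)
conn.executemany(
    "INSERT INTO health VALUES (?, ?)",
    [(1, 40), (2, 44), (3, 41), (4, 45), (5, 43)],
)

# Match each patient's lab-procedure count to their race, then average per group.
DISPARITY_SQL = """
SELECT d.race,
       ROUND(AVG(h.num_lab_procedures), 1) AS avg_lab_procedures
FROM health AS h
JOIN demographics AS d
  ON h.patient_nbr = d.patient_nbr
GROUP BY d.race
ORDER BY d.race
"""

lab_procedures_by_race = conn.execute(DISPARITY_SQL).fetchall()
for race, avg_labs in lab_procedures_by_race:
    print(race, avg_labs)
```

If the averages sit close together across groups, as they do in this toy data, that matches the "no significant disparity" reading.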
4. Correlation between Lab Procedures and Length of Stay
Now, let's shed light on how the number of lab procedures might relate to the length of hospital stays. Do patients who undergo more lab procedures tend to stay in the hospital longer? To investigate this, I classified patients as follows:
Few procedures: between 0 and 24
Average procedures: between 25 and 54
Many procedures: 55 or more
Based on the findings, it appears there is a correlation between these factors. The longer a patient is at the hospital, the more lab procedures they have.
SQL Function Used: Categorization with CASE WHEN:
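A sketch of the bucketing with CASE WHEN. The table `health`, its columns, and the rows are assumptions, and the example treats exactly 55 procedures as "many", which is my reading of the boundary.

```python
import sqlite3

# Hypothetical schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE health (num_lab_procedures INTEGER, time_in_hospital INTEGER)"
)
conn.executemany(
    "INSERT INTO health VALUES (?, ?)",
    [(10, 2), (20, 4), (30, 4), (50, 6), (60, 7), (80, 9)],
)

# CASE WHEN checks conditions top to bottom and stops at the first match,
# so the second branch only sees values of 25 or more.
BUCKET_SQL = """
SELECT CASE
         WHEN num_lab_procedures < 25 THEN 'few'
         WHEN num_lab_procedures < 55 THEN 'average'
         ELSE 'many'
       END AS procedure_frequency,
       ROUND(AVG(time_in_hospital), 1) AS avg_days
FROM health
GROUP BY procedure_frequency
ORDER BY avg_days
"""

stay_by_bucket = conn.execute(BUCKET_SQL).fetchall()
for bucket, avg_days in stay_by_bucket:
    print(bucket, avg_days)
```

An average stay that rises from "few" to "many" is the pattern that suggests the two measures move together.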
5. Fast Identification of Specific Patient Groups
Suppose the hospital needs to quickly identify patients who meet specific criteria, for example patients of Hispanic ethnicity who can take metformin, a medicine used to treat type 2 diabetes and gestational diabetes. Using UNION, I combined the relevant data from the demographics and medical records tables to produce a list of these patients.
SQL Function Used: Combining datasets with UNION:
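A sketch of a UNION over two tables. The table names, the `race` and `metformin` columns, and the rows are hypothetical, and the filter `metformin = 'Up'` is only a guess at how "can take metformin" might be encoded.

```python
import sqlite3

# Hypothetical schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demographics (patient_nbr INTEGER, race TEXT)")
conn.execute("CREATE TABLE health (patient_nbr INTEGER, metformin TEXT)")
conn.executemany(
    "INSERT INTO demographics VALUES (?, ?)",
    [(1, "Hispanic"), (2, "Caucasian"), (3, "Hispanic"), (4, "Asian")],
)
conn.executemany(
    "INSERT INTO health VALUES (?, ?)",
    [(1, "Up"), (2, "Up"), (3, "No"), (4, "No")],
)

# UNION stacks the two result sets and removes duplicate rows,
# so patient 1 (who matches both filters) is listed once.
PATIENT_LIST_SQL = """
SELECT patient_nbr FROM demographics WHERE race = 'Hispanic'
UNION
SELECT patient_nbr FROM health WHERE metformin = 'Up'
ORDER BY patient_nbr
"""

patient_ids = [row[0] for row in conn.execute(PATIENT_LIST_SQL)]
print(patient_ids)
```

Note that UNION returns patients who match either condition. If both conditions must hold for the same patient, INTERSECT or a JOIN with both filters would be the closer fit.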
6. Highlighting Hospital Success Stories
Now, let's consider the scenario where the hospital administrator seeks to showcase notable success stories from the hospital. They aim to spotlight instances where patients arrived under emergency conditions but had shorter than average stays.
SQL Function Used: Common Table Expressions (CTEs) for query organization:
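A sketch of the CTE approach. The table `health`, its columns, the rows, and the use of `admission_type_id = 1` to mean an emergency admission are all assumptions for illustration.

```python
import sqlite3

# Hypothetical schema and made-up rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE health ("
    "patient_nbr INTEGER, admission_type_id INTEGER, time_in_hospital INTEGER)"
)
conn.executemany(
    "INSERT INTO health VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 9), (3, 3, 1), (4, 1, 3), (5, 2, 5)],
)

# The CTE names the overall average stay once, so the main query stays readable.
SUCCESS_SQL = """
WITH avg_time AS (
    SELECT AVG(time_in_hospital) AS avg_days
    FROM health
)
SELECT patient_nbr, time_in_hospital
FROM health
WHERE admission_type_id = 1  -- assumed code for emergency admissions
  AND time_in_hospital < (SELECT avg_days FROM avg_time)
ORDER BY patient_nbr
"""

success_stories = conn.execute(SUCCESS_SQL).fetchall()
print(success_stories)
```

In this toy data the average stay is 4 days, so only the emergency patients who stayed 2 and 3 days make the list.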
Some Other Thoughts
I used SQL for this analysis, which is currently one of my favorite languages for manipulating data. Like most people, I still get a bit of a headache from JOINs, but they keep getting easier with time and practice. I hope you enjoyed the read, and please let me know if there are any interesting topics I should dig into next!
Posted Aug 30, 2025

Analyzed hospital data using SQL to improve patient flow and operational efficiency.