RFM Analysis of a Healthcare Dataset Using Python

Duncan Mathenge

Content Writer

Technical Writer

Jupyter

The dataset covers four diseases and the treatment costs incurred by patients from 10 states in July 2023. The first step is setting up the necessary Python libraries and loading the dataset. Its first five rows are:
   PatientID  TreatmentDate  AmountSpent in $    DiseaseType  TreatmentNumber  \
0       4600     2023-07-17             28226  Heart disease           562915
1       6302     2023-07-17               979   Hypertension           563003
2       2456     2023-07-17               947   Hypertension           878136
3       5602     2023-07-17               703       Diabetes           120090
4       1811     2023-07-17             27380    Lung Cancer           140526

0  California
1       Texas
2    Kentucky
3    Delaware
4      Alaska
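
The loading step can be sketched as follows. This is a minimal sketch rather than the original notebook code: the CSV text is an inline stand-in built from the five rows shown above, and the `State` column name is an assumption, since the header for that column does not appear in the printed output.

```python
from io import StringIO

import pandas as pd

# Inline stand-in for the real file, built from the five rows shown above.
# "State" is an assumed header; the other headers come from the printed output.
csv_text = """PatientID,TreatmentDate,AmountSpent in $,DiseaseType,TreatmentNumber,State
4600,2023-07-17,28226,Heart disease,562915,California
6302,2023-07-17,979,Hypertension,563003,Texas
2456,2023-07-17,947,Hypertension,878136,Kentucky
5602,2023-07-17,703,Diabetes,120090,Delaware
1811,2023-07-17,27380,Lung Cancer,140526,Alaska
"""

# Parse the treatment date as a datetime so it can be used in date arithmetic later.
data = pd.read_csv(StringIO(csv_text), parse_dates=["TreatmentDate"])

# Preview the first five rows.
print(data.head())
```

With a real file, `StringIO(csv_text)` would be replaced by the path to the dataset.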
Calculating RFM Values

The next step is to calculate the Recency, Frequency, and Monetary values for each patient.

from datetime import datetime
   PatientID  TreatmentDate  AmountSpent in $    DiseaseType  TreatmentNumber  \
0       4600     2023-07-17             28226  Heart disease           562915
1       6302     2023-07-17               979   Hypertension           563003
2       2456     2023-07-17               947   Hypertension           878136
3       5602     2023-07-17               703       Diabetes           120090
4       1811     2023-07-17             27380    Lung Cancer           140526

0  California  92 days  1  28226
1       Texas  92 days  1    979
2    Kentucky  92 days  1    947
3    Delaware  92 days  1    703
4      Alaska  92 days  1  27380
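
The values in the output above could be produced along these lines. This is a sketch under stated assumptions, not the original code: the new column names `Recency`, `Frequency`, and `MonetaryValue` are hypothetical (their headers are missing from the output), and the reference date of 2023-10-17 is inferred from the 92-day recency shown for treatments on 2023-07-17.

```python
from datetime import datetime

import pandas as pd

# The five sample rows from the output above.
data = pd.DataFrame({
    "PatientID": [4600, 6302, 2456, 5602, 1811],
    "TreatmentDate": pd.to_datetime(["2023-07-17"] * 5),
    "AmountSpent in $": [28226, 979, 947, 703, 27380],
    "TreatmentNumber": [562915, 563003, 878136, 120090, 140526],
})

# Assumed reference date, chosen so the result matches the 92 days in the output.
reference_date = datetime(2023, 10, 17)

# Recency: time elapsed between each treatment and the reference date.
data["Recency"] = reference_date - data["TreatmentDate"]

# Frequency: number of treatments recorded per patient.
data["Frequency"] = data.groupby("PatientID")["TreatmentNumber"].transform("count")

# Monetary value: total amount spent per patient.
data["MonetaryValue"] = data.groupby("PatientID")["AmountSpent in $"].transform("sum")

print(data[["PatientID", "Recency", "Frequency", "MonetaryValue"]])
```

Using `transform` keeps one row per treatment, so the new columns sit alongside the original ones as they do in the output.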
Calculating RFM Scores

This step converts each patient's recency, frequency, and monetary values into scores.
   PatientID  TreatmentDate  AmountSpent in $    DiseaseType  TreatmentNumber  \
0       4600     2023-07-17             28226  Heart disease           562915
1       6302     2023-07-17               979   Hypertension           563003
2       2456     2023-07-17               947   Hypertension           878136
3       5602     2023-07-17               703       Diabetes           120090
4       1811     2023-07-17             27380    Lung Cancer           140526

0  California  92 days  1  28226  1  1
1       Texas  92 days  1    979  1  1
2    Kentucky  92 days  1    947  1  1
3    Delaware  92 days  1    703  1  1
4      Alaska  92 days  1  27380  1  1

   MonetaryScore  RFM_Score  Value Segment
0              2          4      Low-Value
1              1          3      Low-Value
2              1          3      Low-Value
3              1          3      Low-Value
4              2          4      Low-Value
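
One common way to derive such scores is equal-width binning with `pd.cut`. The sketch below assumes that approach and a 1 to 5 scale; the source does not show which binning method or scale was used. The input values are synthetic and the `Recency`, `Frequency`, `MonetaryValue`, `RecencyScore`, and `FrequencyScore` column names are hypothetical.

```python
import pandas as pd

# Synthetic RFM values for illustration only; these are not rows from the dataset.
rfm = pd.DataFrame({
    "Recency": [10, 30, 50, 70, 90],            # days since the last treatment
    "Frequency": [1, 2, 3, 4, 5],               # number of treatments
    "MonetaryValue": [100, 200, 300, 400, 500], # total amount spent
})

# A smaller recency means a more recent treatment, so its labels run from 5 down to 1.
recency_labels = [5, 4, 3, 2, 1]
# For frequency and monetary value, larger is better, so labels run from 1 up to 5.
ascending_labels = [1, 2, 3, 4, 5]

# Split each column into five equal-width bins and map each bin to its label.
rfm["RecencyScore"] = pd.cut(rfm["Recency"], bins=5, labels=recency_labels).astype(int)
rfm["FrequencyScore"] = pd.cut(rfm["Frequency"], bins=5, labels=ascending_labels).astype(int)
rfm["MonetaryScore"] = pd.cut(rfm["MonetaryValue"], bins=5, labels=ascending_labels).astype(int)

print(rfm)
```

If recency is stored as a timedelta, as in the output above, it can be converted to whole days with `.dt.days` before binning.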
Combining RFM Scores
   PatientID  TreatmentDate  AmountSpent in $  TreatmentNumber
0       4600     2023-07-17             28226           562915
1       6302     2023-07-17               979           563003
2       2456     2023-07-17               947           878136
3       5602     2023-07-17               703           120090
4       1811     2023-07-17             27380           140526

0  92 days  1  28226  1  1
1  92 days  1    979  1  1
2  92 days  1    947  1  1
3  92 days  1    703  1  1
4  92 days  1  27380  1  1

   MonetaryScore  RFM_Score
0              2          4
1              1          3
2              1          3
3              1          3
4              2          4
Lastly, the RFM_Score column combines the individual recency, frequency, and monetary scores into a single value by summing them; in the first row, for example, 1 + 1 + 2 = 4. This score can be used to segment patients and gain insight into their treatment patterns and spending.
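
The combination step can be sketched as a row-wise sum, followed by one possible way to turn scores into segments. The first part uses the five score rows from the output above, with `RecencyScore` and `FrequencyScore` as hypothetical names for the unlabeled score columns. The second part is an assumption: the source does not show how segments were assigned, so the sketch splits a synthetic set of scores into three equal-sized groups with `pd.qcut`, and the `Mid-Value` and `High-Value` labels are illustrative (only `Low-Value` appears in the output).

```python
import pandas as pd

# Individual scores for the five rows shown above.
scores = pd.DataFrame({
    "RecencyScore": [1, 1, 1, 1, 1],
    "FrequencyScore": [1, 1, 1, 1, 1],
    "MonetaryScore": [2, 1, 1, 1, 2],
})

# Combine the three scores into a single RFM score by summing across each row.
score_columns = ["RecencyScore", "FrequencyScore", "MonetaryScore"]
scores["RFM_Score"] = scores[score_columns].sum(axis=1)
print(scores)

# Synthetic RFM scores with more spread, used only to illustrate segmentation.
example = pd.DataFrame({"RFM_Score": [7, 8, 9, 10, 11]})

# Assumed segmentation rule: three quantile-based groups, lowest scores first.
segment_labels = ["Low-Value", "Mid-Value", "High-Value"]
example["Value Segment"] = pd.qcut(example["RFM_Score"], q=3, labels=segment_labels)
print(example)
```

Quantile-based segments depend on the distribution of scores in the full dataset, so the same score can land in a different segment when the data changes.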

Posted Aug 1, 2024
