RFM Analysis of a Healthcare Dataset Using Python

Duncan Mathenge

The dataset covers four diseases and the treatment costs incurred by patients from 10 states in July 2023. The first step is to import the necessary Python libraries and load the dataset:
   PatientID TreatmentDate  AmountSpent in $    DiseaseType  TreatmentNumber  \
0       4600    2023-07-17             28226  Heart disease           562915
1       6302    2023-07-17               979   Hypertension           563003
2       2456    2023-07-17               947   Hypertension           878136
3       5602    2023-07-17               703       Diabetes           120090
4       1811    2023-07-17             27380    Lung Cancer           140526

0  California
1       Texas
2    Kentucky
3    Delaware
4      Alaska
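The loading code itself is not shown, so the following is only a minimal sketch of how a preview like the one above could be produced. It rebuilds the five visible rows by hand instead of reading the original file, whose name is not given. The column name `State` is an assumption, since that header does not appear in the output.

```python
import pandas as pd

# The five preview rows shown above, typed in by hand so the sketch runs
# without the original file. "State" is an assumed column name.
rows = [
    (4600, "2023-07-17", 28226, "Heart disease", 562915, "California"),
    (6302, "2023-07-17", 979, "Hypertension", 563003, "Texas"),
    (2456, "2023-07-17", 947, "Hypertension", 878136, "Kentucky"),
    (5602, "2023-07-17", 703, "Diabetes", 120090, "Delaware"),
    (1811, "2023-07-17", 27380, "Lung Cancer", 140526, "Alaska"),
]
columns = [
    "PatientID",
    "TreatmentDate",
    "AmountSpent in $",
    "DiseaseType",
    "TreatmentNumber",
    "State",
]
data = pd.DataFrame(rows, columns=columns)

# Parse the date strings so later date arithmetic works.
data["TreatmentDate"] = pd.to_datetime(data["TreatmentDate"])

print(data.head())
```

With the real dataset, the hand-typed rows would be replaced by a single read call such as `pd.read_csv` pointed at the file.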
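Calculating RFM Values
The next step is to calculate the Recency, Frequency, and Monetary values of the patients. In the output below, the three values after each state are the recency (time elapsed since the treatment date), the frequency (number of treatments), and the monetary value (amount spent).
from datetime import datetime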
   PatientID TreatmentDate  AmountSpent in $    DiseaseType  TreatmentNumber  \
0       4600    2023-07-17             28226  Heart disease           562915
1       6302    2023-07-17               979   Hypertension           563003
2       2456    2023-07-17               947   Hypertension           878136
3       5602    2023-07-17               703       Diabetes           120090
4       1811    2023-07-17             27380    Lung Cancer           140526

0  California  92 days  1  28226
1       Texas  92 days  1    979
2    Kentucky  92 days  1    947
3    Delaware  92 days  1    703
4      Alaska  92 days  1  27380
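The calculation behind these values is not shown, so here is a hedged sketch of one way to derive them with pandas. The column names `Recency`, `Frequency`, and `MonetaryValue` and the helper `add_rfm_values` are assumptions for illustration. The reference date of 2023-10-17 is inferred from the "92 days" in the output, which suggests the notebook was run on that day using the current date. Unlike the output above, this sketch stores recency as an integer number of days, which is easier to bin later.

```python
import pandas as pd


def add_rfm_values(df, reference_date):
    """Return a copy of df with assumed Recency, Frequency, MonetaryValue columns."""
    out = df.copy()
    out["TreatmentDate"] = pd.to_datetime(out["TreatmentDate"])

    # Recency: whole days between the treatment and the reference date.
    out["Recency"] = (pd.Timestamp(reference_date) - out["TreatmentDate"]).dt.days

    # Frequency: number of treatment records per patient.
    out["Frequency"] = out.groupby("PatientID")["TreatmentNumber"].transform("count")

    # Monetary value: total amount spent per patient.
    out["MonetaryValue"] = out.groupby("PatientID")["AmountSpent in $"].transform("sum")
    return out


# Two of the rows shown in the output above.
sample = pd.DataFrame(
    {
        "PatientID": [4600, 6302],
        "TreatmentDate": ["2023-07-17", "2023-07-17"],
        "AmountSpent in $": [28226, 979],
        "TreatmentNumber": [562915, 563003],
    }
)
rfm = add_rfm_values(sample, reference_date="2023-10-17")
print(rfm[["PatientID", "Recency", "Frequency", "MonetaryValue"]])
```

Passing the reference date explicitly, rather than calling `datetime.now()` inside the function, keeps the result reproducible no matter when the code is run.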
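Calculating RFM Scores
This step involves calculating the recency, frequency, and monetary scores. The output below adds a score for each of the three dimensions, followed by the combined RFM_Score and a Value Segment label.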
   PatientID TreatmentDate  AmountSpent in $    DiseaseType  TreatmentNumber  \
0       4600    2023-07-17             28226  Heart disease           562915
1       6302    2023-07-17               979   Hypertension           563003
2       2456    2023-07-17               947   Hypertension           878136
3       5602    2023-07-17               703       Diabetes           120090
4       1811    2023-07-17             27380    Lung Cancer           140526

0  California  92 days  1  28226  1  1
1       Texas  92 days  1    979  1  1
2    Kentucky  92 days  1    947  1  1
3    Delaware  92 days  1    703  1  1
4      Alaska  92 days  1  27380  1  1

   MonetaryScore  RFM_Score Value Segment
0              2          4     Low-Value
1              1          3     Low-Value
2              1          3     Low-Value
3              1          3     Low-Value
4              2          4     Low-Value
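The scoring rule used to produce these columns is not stated. A common approach, sketched here under that assumption, is equal-width binning with `pd.cut`: each RFM value is split into five bins and mapped to a score from 1 to 5, with the recency scale reversed so that more recent treatments score higher. The input values below are hypothetical and chosen only to make the binning easy to follow. The names `RecencyScore` and `FrequencyScore` are assumed by analogy with the `MonetaryScore` header in the output.

```python
import pandas as pd


def add_rfm_scores(rfm, n_bins=5):
    """Return a copy of rfm with a 1..n_bins score for each RFM dimension."""
    out = rfm.copy()
    ascending = list(range(1, n_bins + 1))  # low value -> low score
    descending = ascending[::-1]            # low value -> high score

    # Fewer days since treatment means a higher recency score.
    out["RecencyScore"] = pd.cut(
        out["Recency"], bins=n_bins, labels=descending
    ).astype(int)
    # More treatments and more spending mean higher scores.
    out["FrequencyScore"] = pd.cut(
        out["Frequency"], bins=n_bins, labels=ascending
    ).astype(int)
    out["MonetaryScore"] = pd.cut(
        out["MonetaryValue"], bins=n_bins, labels=ascending
    ).astype(int)
    return out


# Hypothetical RFM values, not taken from the dataset.
example = pd.DataFrame(
    {
        "Recency": [10, 30, 50, 70, 90],
        "Frequency": [1, 2, 3, 4, 5],
        "MonetaryValue": [100, 300, 500, 700, 900],
    }
)
scored = add_rfm_scores(example)
print(scored)
```

Because the bin edges depend on the minimum and maximum of each column, scores computed on a small sample will generally differ from scores computed on the full dataset.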
Combining RFM Scores
   PatientID TreatmentDate  AmountSpent in $  TreatmentNumber  \
0       4600    2023-07-17             28226           562915
1       6302    2023-07-17               979           563003
2       2456    2023-07-17               947           878136
3       5602    2023-07-17               703           120090
4       1811    2023-07-17             27380           140526

0  92 days  1  28226  1  1
1  92 days  1    979  1  1
2  92 days  1    947  1  1
3  92 days  1    703  1  1
4  92 days  1  27380  1  1

   MonetaryScore  RFM_Score
0              2          4
1              1          3
2              1          3
3              1          3
4              2          4
Lastly, the RFM_Score column adds up the individual scores for recency, frequency, and monetary value into a single RFM score. For example, the first patient's scores of 1, 1, and 2 give an RFM score of 4. This score can be used to segment patients and gain insights into their behavior and preferences.
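As a rough sketch of this last step, the snippet below sums the three scores and maps the total to a segment label. Only the `Low-Value` label appears in the output above. The `Mid-Value` and `High-Value` labels, the cut points, and the helper `add_segments` are hypothetical, and the original notebook may well have used quantile-based bins instead. The first two rows reuse score patterns from the output, and the last two are made up for illustration.

```python
import pandas as pd


def add_segments(scores):
    """Return a copy of scores with RFM_Score and an assumed Value Segment."""
    out = scores.copy()

    # Combined score: simple sum of the three individual scores (range 3..15).
    out["RFM_Score"] = (
        out["RecencyScore"] + out["FrequencyScore"] + out["MonetaryScore"]
    )

    # Hypothetical cut points: 3-6 low, 7-10 mid, 11-15 high.
    out["Value Segment"] = pd.cut(
        out["RFM_Score"],
        bins=[2, 6, 10, 15],
        labels=["Low-Value", "Mid-Value", "High-Value"],
    )
    return out


scores = pd.DataFrame(
    {
        "RecencyScore": [1, 1, 3, 5],
        "FrequencyScore": [1, 1, 3, 4],
        "MonetaryScore": [2, 1, 2, 5],
    }
)
segmented = add_segments(scores)
print(segmented[["RFM_Score", "Value Segment"]])
```

Fixed cut points keep the segments stable as new patients are added, whereas quantile-based bins would rebalance the segments every time the data changes.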