The dataset covers four different diseases and the treatment costs incurred by patients from 10 different states in July 2023. The first step is to import the necessary Python libraries and load the dataset.
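As a minimal sketch of this loading step, the snippet below builds a small DataFrame from the five sample rows printed in this article instead of reading a file, since the dataset's file name is not given here. The column name `State` and the variable name `data` are assumptions for illustration; the other column names are taken from the printed output.

```python
import pandas as pd

# Five sample rows copied from the output printed in this article.
# With the real dataset, this would typically be a pd.read_csv(...) call.
# "State" is an assumed column name; the article does not show that header.
data = pd.DataFrame({
    "PatientID": [4600, 6302, 2456, 5602, 1811],
    "State": ["California", "Texas", "Kentucky", "Delaware", "Alaska"],
    "TreatmentDate": ["2023-07-17"] * 5,
    "AmountSpent in $": [28226, 979, 947, 703, 27380],
    "TreatmentNumber": [562915, 563003, 878136, 120090, 140526],
})

# Parse the date strings so that date arithmetic (such as recency) works later.
data["TreatmentDate"] = pd.to_datetime(data["TreatmentDate"])

print(data.head())
```

Converting `TreatmentDate` to a datetime type up front keeps later steps, such as subtracting it from a reference date, straightforward.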
0  California  92 days  1  28226  1  1
1       Texas  92 days  1    979  1  1
2    Kentucky  92 days  1    947  1  1
3    Delaware  92 days  1    703  1  1
4      Alaska  92 days  1  27380  1  1

   MonetaryScore  RFM_Score  Value Segment
0              2          4      Low-Value
1              1          3      Low-Value
2              1          3      Low-Value
3              1          3      Low-Value
4              2          4      Low-Value
Combining RFM Scores
   PatientID  TreatmentDate  AmountSpent in $  TreatmentNumber
0       4600     2023-07-17             28226           562915
1       6302     2023-07-17               979           563003
2       2456     2023-07-17               947           878136
3       5602     2023-07-17               703           120090
4       1811     2023-07-17             27380           140526
0  92 days  1  28226  1  1
1  92 days  1    979  1  1
2  92 days  1    947  1  1
3  92 days  1    703  1  1
4  92 days  1  27380  1  1

   MonetaryScore  RFM_Score
0              2          4
1              1          3
2              1          3
3              1          3
4              2          4
Lastly, the RFM_Score column combines the individual scores for recency, frequency, and monetary value into a single RFM score. This score can be used to segment patients and gain insights into their behavior and preferences.
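The combination step can be sketched as a row-wise sum of the three scores, which reproduces the RFM_Score values shown in the output above (for example, 1 + 1 + 2 = 4 for the first row). The column names `RecencyScore` and `FrequencyScore` are assumptions, since only `MonetaryScore` and `RFM_Score` appear as headers in the output, and summing is one common way to combine the scores rather than necessarily the exact code used here.

```python
import pandas as pd

# Individual scores copied from the five sample rows shown above.
# "RecencyScore" and "FrequencyScore" are assumed column names.
scores = pd.DataFrame({
    "RecencyScore": [1, 1, 1, 1, 1],
    "FrequencyScore": [1, 1, 1, 1, 1],
    "MonetaryScore": [2, 1, 1, 1, 2],
})

# Combine the three scores into a single RFM score by summing across each row.
score_columns = ["RecencyScore", "FrequencyScore", "MonetaryScore"]
scores["RFM_Score"] = scores[score_columns].sum(axis=1)

print(scores)
```

A simple sum weights recency, frequency, and monetary value equally; a weighted sum would be an alternative if one dimension matters more for the analysis.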