AI clinical chat QA scoring model by Linden Jensen-PageAI clinical chat QA scoring model by Linden Jensen-Page

AI clinical chat QA scoring model

Linden Jensen-Page

Linden Jensen-Page

Summary

For a healthcare technology company, I built an AI QA Application for automated patient/clinician chat assessments to meet regulatory requirements around service quality. It autonomously evaluated 1000s of daily chats for user experience and clinical accuracy. The solution was built with Python, OpenAI API, Langchain, Pydantic, Google BigQuery, and deployed on GCP using Docker and Cloud Run. The service dramatically reduced the time taken for human reviews with the aim of replacing them altogether, saving the operations team days of work per week . Few-shot learning was used for prompt optimisation to minimise hallucinations.

Details

AI QA model was used to assess clinical and general quality of chats between health care practitioners and patients for a London based healthcare tech scale up
It was designed to process 1000s of daily chat transcripts to flag issues. The model was built around a detailed multi-step scoring rubric that performed well at scale
Few-shot training was used on labelled data to optimise the prompt to minimise hallucinations and maximise consistency of the content and structure of the outputs

Tech stack

OpenAI API, GPT-4o
Python, Langchain, Pydantic
Google Cloud Platform (GCP), Cloud Run, BigQuery
Docker
Message Bird (access to chats via API)
Like this project

Posted Feb 10, 2025

AI application for healthcare technology company to automate patient/clinician chat QA, ensuring regulatory compliance and reducing human work by days per week