I architected and built an AI system that automatically classifies failed data pipelines across all of Meta's data centers globally. I also built a companion system that automatically fixes those failed pipelines where possible, without the need for human intervention.
The impact was that we could now automatically identify 90% of all data pipeline failures and automatically fix 30% of them. The project was deployed globally at Meta, where the system scanned, classified, and fixed 500 million data pipeline runs per month, saving thousands of engineering hours each month.