Monitoring and troubleshooting a distributed network of ~200,000 Linux servers.
Understanding and analysing the root cause of complex alerts in a distributed system, then resolving them.
Automating and responding to alerts and procedures according to predefined priority levels and SLAs.