I implemented a clean, modular stack using Prometheus, Grafana, Loki, and Alertmanager. All components were provisioned via Ansible and containerized for portability. We defined meaningful alerting thresholds, routed pages based on severity and ownership, and built dashboards that weren’t just “graphs for graphs’ sake.”