Most LLM telemetry posts brag

Kelvin Desman

Most LLM telemetry posts brag about what they monitor. Here's mine — including what I'm still missing.
Built this for lokerdollar.com, solo, in production:
Covered:
- p50 / p95 / p99 latency per task
- cost per 1k tokens + daily burn + per-task spend
- provider success rates + circuit breaker state
- cache hit rate, quota / rate limit tracking
- model leaderboard scored on live traffic
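For anyone wondering what the first two items can look like in code, here is a minimal sketch. It is not the lokerdollar.com implementation: the record fields (`task`, `latency_ms`, `tokens`, `cost_usd`) and the function names are assumptions made up for illustration, and the percentile uses the simple nearest-rank method rather than interpolation.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer-friendly form avoids float drift for whole-number percentiles.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(records):
    """Group hypothetical request records by task and compute latency and cost stats."""
    by_task = defaultdict(list)
    for record in records:
        by_task[record["task"]].append(record)

    summary = {}
    for task, rows in by_task.items():
        latencies = [row["latency_ms"] for row in rows]
        tokens = sum(row["tokens"] for row in rows)
        cost = sum(row["cost_usd"] for row in rows)
        summary[task] = {
            "p50": percentile(latencies, 50),
            "p95": percentile(latencies, 95),
            "p99": percentile(latencies, 99),
            "spend_usd": cost,
            "cost_per_1k_tokens": cost / tokens * 1000 if tokens else 0.0,
        }
    return summary
```

Nearest-rank keeps every reported percentile equal to a latency that actually occurred, which is easier to explain on a dashboard than an interpolated value. At higher traffic a histogram or sketch structure would likely replace the full sort.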
Gaps I'm still chasing:
- prompt versioning tied to quality signals
- TTFT for streaming UX (matters for chat tasks)
- per-user cost attribution
- semantic error taxonomy, not just HTTP codes
- push-based degradation alerts (today's playbook is reactive)
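Of these gaps, TTFT (time to first token) is probably the cheapest to close. A rough sketch, assuming the provider client exposes the streamed response as a plain iterator of text chunks: `measure_stream` is a hypothetical helper, not part of any SDK, and the injectable `clock` parameter is there only to make it testable.

```python
import time


def measure_stream(chunks, clock=time.perf_counter):
    """Consume an iterator of text chunks and return (text, ttft_seconds, total_seconds).

    ttft_seconds is None if the stream never yields a non-empty chunk.
    """
    start = clock()
    ttft = None
    parts = []
    for chunk in chunks:
        # Skip empty keep-alive chunks: TTFT is the first chunk the user can see.
        if ttft is None and chunk:
            ttft = clock() - start
        parts.append(chunk)
    total = clock() - start
    return "".join(parts), ttft, total
```

Usage would be along the lines of `text, ttft, total = measure_stream(provider_stream)`, with `ttft` recorded next to the existing per-task latency so chat tasks can be tracked on perceived responsiveness and not only on total duration.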
The lesson from running this solo: observability is never "done." It's a backlog that evolves with the product — and the gap list is more honest signal than the green checkmarks.
→ Open for AI platform / full-stack work. If you want this layer for your team, my DMs are open.
Posted May 27, 2026