A real-time agent monitoring platform that gives engineering teams full visibility into every agent span, from token counts to trace waterfalls. Once you deploy an agentic AI system you're flying blind: token usage, latency, and failure points are invisible without purpose-built instrumentation. This fills that gap.
Architecture
A gRPC streaming pipeline (grpcio-aio + Envoy) ingests span events from instrumented agents with sub-second latency. An exportable Python SDK (AgentTracer) wraps any agentic workflow with a context-manager API: zero-friction instrumentation with no changes to agent logic.
What it tracks per span
Token counts (input, output, total)
Latency per agent step
Outcomes and error states
Full trace waterfall across multi-agent runs
Security and access control
Auth layer uses OIDC (Authorization Code Flow + PKCE) with attribute-based access control (ABAC). Clearance level, department, and ownership policies enforced per span at query time. Redis for PKCE state and session revocation.
Frontend
React/TypeScript dashboard with live token usage charts and a span waterfall trace viewer.