How to track LLM token usage and costs per agent, set burn-rate baselines, detect spending anomalies, and query Prometheus metrics for cost visibility.
Agent Manager’s FinOps tooling gives you real-time visibility into LLM token usage and associated costs. You can monitor spending per agent and session, set expected burn-rate baselines, and receive alerts when agent costs deviate from normal behavior.
Token-to-USD conversion rates per model. Agent Manager uses these rates to compute cost estimates from raw token counts reported by the LLM provider.
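The conversion from token counts to a cost estimate can be sketched as follows. This is a minimal illustration, not Agent Manager's implementation: the model names, the per-million-token rates, and the `estimate_cost_usd` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """USD price per one million tokens (hypothetical values)."""
    input_per_mtok: float
    output_per_mtok: float


# Hypothetical model names and rates, for illustration only.
RATES = {
    "example-small": ModelRate(input_per_mtok=0.50, output_per_mtok=1.50),
    "example-large": ModelRate(input_per_mtok=5.00, output_per_mtok=15.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one call from provider-reported token counts."""
    rate = RATES[model]
    return (
        input_tokens * rate.input_per_mtok
        + output_tokens * rate.output_per_mtok
    ) / 1_000_000
```

Input and output tokens are priced separately here because many providers bill them at different rates; a real rate table may need more dimensions, such as cached-token pricing.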
Burn rates
Real-time USD spend velocity per active session. Sliding-window accumulators track cost per hour so you can detect runaway sessions before they become costly.
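To make the sliding-window idea concrete, here is a minimal sketch of an accumulator that reports spend velocity in USD per hour. The `BurnRateWindow` class and its method names are assumptions made for illustration; they do not describe Agent Manager's internal data structures.

```python
from collections import deque


class BurnRateWindow:
    """Tracks cost events inside a trailing time window (hypothetical sketch)."""

    def __init__(self, window_seconds: float = 3600.0) -> None:
        self.window_seconds = window_seconds
        self._events: deque[tuple[float, float]] = deque()  # (timestamp, cost_usd)

    def add(self, timestamp: float, cost_usd: float) -> None:
        """Record a cost event; timestamps are expected in increasing order."""
        self._events.append((timestamp, cost_usd))

    def usd_per_hour(self, now: float) -> float:
        """Drop events older than the window, then scale the sum to one hour."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()
        window_total = sum(cost for _, cost in self._events)
        return window_total * 3600.0 / self.window_seconds
```

A shorter window (for example 900 seconds) reacts faster to a runaway session but produces a noisier rate, since the window total is extrapolated to a full hour.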
Historical trends
Daily cost aggregations over a configurable trailing window (7, 30, or up to 90 days), broken down by agent and organization.
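A daily aggregation of this kind can be sketched in a few lines. The record layout (date, organization, agent, cost) and the `daily_costs` function are hypothetical; the actual storage schema and API response shape are not described here.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_costs(records, today: date, window_days: int = 7) -> dict:
    """Sum cost per (day, organization, agent) over a trailing window.

    `records` is an iterable of (day, organization, agent, cost_usd) tuples,
    a hypothetical layout chosen for this example.
    """
    if not 1 <= window_days <= 90:
        raise ValueError("window_days must be between 1 and 90")
    start = today - timedelta(days=window_days - 1)  # window includes today
    totals = defaultdict(float)
    for day, org, agent, cost_usd in records:
        if start <= day <= today:
            totals[(day, org, agent)] += cost_usd
    return dict(totals)
```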
Anomaly detection
A session is flagged as an anomaly when its burn rate exceeds its agent's registered baseline by a configurable multiplier. Flagged sessions are visible in the dashboard and through the API.
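The check can be sketched as a comparison against baseline times multiplier. This reading of the rule is an assumption, as are the `find_anomalies` function, its argument shapes, and the choice to skip agents with no registered baseline.

```python
def find_anomalies(
    session_rates: dict[str, tuple[str, float]],
    baselines: dict[str, float],
    multiplier: float = 2.0,
) -> list[str]:
    """Return IDs of sessions whose burn rate exceeds baseline * multiplier.

    session_rates maps session ID -> (agent ID, current USD/hour).
    baselines maps agent ID -> expected USD/hour.
    Both shapes are hypothetical, chosen for this example.
    """
    flagged = []
    for session_id, (agent_id, usd_per_hour) in session_rates.items():
        baseline = baselines.get(agent_id)
        if baseline is None:
            continue  # assumption: no baseline registered, nothing to compare
        if usd_per_hour > baseline * multiplier:
            flagged.append(session_id)
    return sorted(flagged)
```

With a baseline of 1.00 USD/hour and a multiplier of 2.0, a session burning 2.50 USD/hour is flagged, while one burning 1.50 USD/hour is not.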
Set baselines after running your agents in normal conditions for a few days. Use the historical trends endpoint to determine a representative USD/hour figure for each agent.
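One way to derive a representative figure from a few days of history is to take the median hourly rate, which is less sensitive to a single unusually expensive day than the mean. The sketch below assumes you have already extracted, for each day, the total cost and the number of hours the agent was active; the `suggest_baseline` helper and that input shape are hypothetical.

```python
from statistics import median


def suggest_baseline(daily: list[tuple[float, float]]) -> float:
    """Suggest a USD/hour baseline from (daily_cost_usd, active_hours) pairs.

    Days with zero active hours are ignored. The input shape is a
    hypothetical one, not the format returned by any endpoint.
    """
    rates = [cost / hours for cost, hours in daily if hours > 0]
    if not rates:
        raise ValueError("no days with active hours to derive a baseline from")
    return median(rates)
```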
Returns system status, including database connectivity, Docker availability (for the code sandbox), and the health of any configured API providers.
Use the cache impact time-series endpoint (GET /api/v1/finops/cache-impact) to measure how effectively your agents use semantic caching. Higher cache hit rates directly reduce LLM spend.
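Once you have the time series, an overall hit rate is a simple ratio. The sketch below assumes each data point carries `cache_hits` and `cache_misses` counts; those field names are hypothetical, since the response schema of the endpoint is not shown here.

```python
def cache_hit_rate(points: list[dict]) -> float:
    """Overall cache hit rate across a time series, as a fraction in [0, 1].

    Each point is assumed to look like {"cache_hits": int, "cache_misses": int};
    these field names are hypothetical.
    """
    hits = sum(p["cache_hits"] for p in points)
    misses = sum(p["cache_misses"] for p in points)
    total = hits + misses
    return hits / total if total else 0.0
```

Summing counts before dividing weights each interval by its request volume, which avoids letting a quiet interval with a few lucky hits skew the result.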