# FinOps: Budget Controls and Usage Tracking

> How to track LLM token usage and costs per agent, set burn-rate baselines, detect spending anomalies, and query Prometheus metrics for cost visibility.

Agent Manager's FinOps tooling gives you real-time visibility into LLM token usage and associated costs. You can monitor spending per agent and session, set expected burn-rate baselines, and receive alerts when agent costs deviate from normal behavior.

## What FinOps tracks

<CardGroup cols={2}>
  <Card title="Valuation rates" icon="dollar-sign">
    Token-to-USD conversion rates per model. Agent Manager uses these rates to compute cost estimates from raw token counts reported by the LLM provider.
  </Card>

  <Card title="Burn rates" icon="fire">
    Real-time USD spend velocity per active session. Sliding-window accumulators track cost per hour so you can detect runaway sessions before they become costly.
  </Card>

  <Card title="Historical trends" icon="chart-line">
    Daily cost aggregations over a configurable trailing window (for example 7 or 30 days, up to a maximum of 90), broken down by agent and organization.
  </Card>

  <Card title="Anomaly detection" icon="triangle-exclamation">
    Sessions whose burn rate exceeds a registered agent baseline by a configurable multiplier are flagged as anomalies, visible in the dashboard and via API.
  </Card>
</CardGroup>
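
To make the burn-rate idea concrete, here is a minimal Python sketch of a sliding-window accumulator for a single session. The class name `BurnRateWindow`, its methods, and the 10-minute window are illustrative assumptions; the page does not describe Agent Manager's internal implementation or window length.

```python
from collections import deque


class BurnRateWindow:
    """Hypothetical sliding-window accumulator for one session's spend."""

    def __init__(self, window_seconds: float = 600.0):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp_seconds, cost_usd) pairs, oldest first
        self._total_usd = 0.0   # running sum of the costs still inside the window

    def record(self, timestamp: float, cost_usd: float) -> None:
        """Add one cost event, e.g. the estimated cost of a single LLM call."""
        self._events.append((timestamp, cost_usd))
        self._total_usd += cost_usd

    def usd_per_hour(self, now: float) -> float:
        """Evict events older than the window, then scale the sum to an hourly rate."""
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, expired_cost = self._events.popleft()
            self._total_usd -= expired_cost
        return self._total_usd * 3600.0 / self.window_seconds


window = BurnRateWindow(window_seconds=600.0)
window.record(timestamp=0.0, cost_usd=0.10)
window.record(timestamp=300.0, cost_usd=0.25)

# Both events are inside the window: 0.35 USD in 10 minutes, about 2.10 USD/hour.
rate_early = window.usd_per_hour(now=500.0)
# The first event has aged out: 0.25 USD in 10 minutes, about 1.50 USD/hour.
rate_late = window.usd_per_hour(now=700.0)
```

Keeping a running total means each reading only pays for the events that expire, rather than re-summing the whole window.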

## Viewing cost data

**Historical cost trends** (trailing N days, default 7):

```bash theme={null}
GET /api/v1/finops/trends?days=30
```

**Cost allocation by agent and org:**

```bash theme={null}
GET /api/v1/finops/allocations?days=7
```

**Cost allocation broken down by LLM model:**

```bash theme={null}
GET /api/v1/finops/allocations/by-model?days=7
```

**Active session burn rates:**

```bash theme={null}
GET /api/v1/finops/burn-rates/active
```

Returns one entry per active session with its cumulative USD spend within the current observation window.

**Cache ROI statistics:**

```bash theme={null}
GET /api/v1/finops/roi-stats
```

Returns accumulated cache savings (USD), embedding costs (USD), and net ROI since the last application restart.

## Configuring valuation rates

Agent Manager computes cost estimates using per-model token-to-USD rates. As the `PerKTokens` field names indicate, each rate is expressed in USD per 1,000 tokens. Retrieve the current rate table:

```bash theme={null}
GET /api/v1/finops/valuation-rates
```

Register or update a model's rates:

```bash theme={null}
curl -X PUT http://your-host/api/v1/finops/valuation-rates \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "modelId": "gpt-4o",
    "inputRatePerKTokens": 0.0025,
    "outputRatePerKTokens": 0.01,
    "cachedInputRatePerKTokens": 0.00125,
    "reasoningRatePerKTokens": 0.01
  }'
```

Rate updates take effect immediately: the new values are stored in an in-memory concurrent cache and applied to all subsequent runs.
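
As a rough illustration of how per-1,000-token rates turn token counts into a USD estimate, the following Python sketch mirrors the fields of the payload above. The `ValuationRate` class and `estimate_cost_usd` function are hypothetical names, and the sketch assumes the four token counts are disjoint categories; how Agent Manager actually combines them is not stated on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValuationRate:
    """Mirrors the PUT payload fields; every rate is USD per 1,000 tokens."""
    model_id: str
    input_rate_per_k_tokens: float
    output_rate_per_k_tokens: float
    cached_input_rate_per_k_tokens: float
    reasoning_rate_per_k_tokens: float


def estimate_cost_usd(rate, input_tokens=0, output_tokens=0,
                      cached_input_tokens=0, reasoning_tokens=0):
    """Estimate cost, assuming the token counts do not overlap (an assumption)."""
    weighted_tokens = (
        input_tokens * rate.input_rate_per_k_tokens
        + output_tokens * rate.output_rate_per_k_tokens
        + cached_input_tokens * rate.cached_input_rate_per_k_tokens
        + reasoning_tokens * rate.reasoning_rate_per_k_tokens
    )
    return weighted_tokens / 1000.0


# Rates taken from the example request above.
gpt_4o = ValuationRate("gpt-4o", 0.0025, 0.01, 0.00125, 0.01)

# 12,000 fresh input tokens, 8,000 cached input tokens, 3,000 output tokens.
cost = estimate_cost_usd(
    gpt_4o, input_tokens=12_000, output_tokens=3_000, cached_input_tokens=8_000
)
print(f"{cost:.4f}")  # 0.0700
```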

## Setting agent burn-rate baselines

Baselines define the expected normal USD/hour spend for an agent. Agent Manager uses baselines to identify anomalous sessions.

```bash theme={null}
curl -X PUT http://your-host/api/v1/finops/baselines/{agentId} \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -d '{
    "baselineUsdPerHour": 0.50
  }'
```

<Note>
  Set baselines after running your agents under normal conditions for a few days. Use the historical trends endpoint to determine a representative USD/hour figure for each agent.
</Note>
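
One way to turn daily history into a USD/hour figure is sketched below in Python. The response shape of the trends endpoint is not documented on this page, so the sketch assumes you have already extracted two hypothetical lists per agent: daily cost in USD and the number of hours the agent was active each day. It takes the median so that a single unusually expensive day does not inflate the baseline.

```python
from statistics import median


def suggest_baseline_usd_per_hour(daily_cost_usd, daily_active_hours):
    """Median of per-day (cost / active hours), skipping days with no activity."""
    hourly_rates = [
        cost / hours
        for cost, hours in zip(daily_cost_usd, daily_active_hours)
        if hours > 0
    ]
    if not hourly_rates:
        raise ValueError("no days with activity; cannot derive a baseline")
    return round(median(hourly_rates), 2)


# Hypothetical week for one agent: one idle day and one unusually expensive day.
costs = [3.9, 4.2, 0.0, 4.0, 12.5, 4.1, 3.8]
hours = [8.0, 8.0, 0.0, 8.0, 10.0, 8.0, 8.0]
baseline = suggest_baseline_usd_per_hour(costs, hours)
```

The resulting value could then be sent as `baselineUsdPerHour` in the request above.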

## Anomaly detection

When a session's burn rate exceeds its agent's baseline by a configured multiplier, it appears as an active anomaly:

```bash theme={null}
GET /api/v1/finops/anomalies/active
```

```json theme={null}
[
  {
    "sessionId": "session-uuid",
    "agentId": "finance_agent",
    "burnRateUsdPerHour": 4.20,
    "baselineUsdPerHour": 0.50,
    "anomalyRatio": 8.4
  }
]
```

An empty array means no sessions are currently anomalous.
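
In the example response, `anomalyRatio` equals the burn rate divided by the baseline (4.20 / 0.50 = 8.4). The Python sketch below reproduces that comparison on the client side. It is an illustration only: the `find_anomalies` function, the input shapes, and the default multiplier of 3.0 are assumptions, since this page does not state the configured multiplier or how sessions without a baseline are handled.

```python
def find_anomalies(active_burn_rates, baselines, multiplier=3.0):
    """Return sessions whose burn rate exceeds baseline * multiplier, worst first.

    active_burn_rates: list of dicts with sessionId, agentId, burnRateUsdPerHour.
    baselines: dict mapping agentId to its baseline in USD/hour.
    """
    anomalies = []
    for session in active_burn_rates:
        baseline = baselines.get(session["agentId"])
        if baseline is None or baseline <= 0:
            continue  # no usable baseline registered, so nothing to compare against
        ratio = session["burnRateUsdPerHour"] / baseline
        if ratio > multiplier:
            anomalies.append({
                "sessionId": session["sessionId"],
                "agentId": session["agentId"],
                "burnRateUsdPerHour": session["burnRateUsdPerHour"],
                "baselineUsdPerHour": baseline,
                "anomalyRatio": round(ratio, 2),
            })
    return sorted(anomalies, key=lambda a: a["anomalyRatio"], reverse=True)


sessions = [
    {"sessionId": "session-a", "agentId": "finance_agent", "burnRateUsdPerHour": 4.20},
    {"sessionId": "session-b", "agentId": "finance_agent", "burnRateUsdPerHour": 0.60},
    {"sessionId": "session-c", "agentId": "unregistered_agent", "burnRateUsdPerHour": 9.00},
]
flagged = find_anomalies(sessions, baselines={"finance_agent": 0.50})
```

Here only `session-a` is flagged: `session-b` runs at 1.2 times its baseline, and `session-c` belongs to an agent with no registered baseline.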

## Prometheus metrics

Agent Manager exposes FinOps data in Prometheus format at the standard actuator endpoint:

```bash theme={null}
GET /actuator/prometheus
```

Key metrics:

| Metric                      | Type    | Description                             |
| --------------------------- | ------- | --------------------------------------- |
| `agent.runs`                | Counter | Total agent run count                   |
| `agent.tool.calls`          | Counter | Total tool invocations                  |
| `finops.cache.savings.usd`  | Counter | Cumulative USD saved via semantic cache |
| `finops.embedding.cost.usd` | Summary | Cumulative USD spent on embeddings      |
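
The table lists dotted metric names. If they are exported through Micrometer, as the actuator endpoint suggests, the scrape output normally uses underscores, adds a `_total` suffix to counters, and splits a summary into `_sum` and `_count` series. The Python sketch below sums one metric across label sets under that assumption. The sample text, including its `agent` label, is hypothetical, and the parser assumes lines carry no trailing timestamp.

```python
def sum_metric(scrape_text, metric_name):
    """Sum every sample of one metric across label sets in Prometheus text format."""
    total = 0.0
    for line in scrape_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name == metric_name:
            total += float(value)
    return total


# Hypothetical scrape output; real label names and values will differ.
SAMPLE_SCRAPE = """
# TYPE finops_cache_savings_usd_total counter
finops_cache_savings_usd_total{agent="finance_agent"} 12.5
finops_cache_savings_usd_total{agent="support_agent"} 3.25
# TYPE finops_embedding_cost_usd summary
finops_embedding_cost_usd_sum 1.75
finops_embedding_cost_usd_count 40.0
"""

cache_savings_usd = sum_metric(SAMPLE_SCRAPE, "finops_cache_savings_usd_total")
embedding_cost_usd = sum_metric(SAMPLE_SCRAPE, "finops_embedding_cost_usd_sum")
print(cache_savings_usd, embedding_cost_usd)  # 15.75 1.75
```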

## Health check

```bash theme={null}
GET /actuator/health
```

Returns system status, including database connectivity, Docker availability (for the code sandbox), and the health of any configured API providers.

<Tip>
  Use the cache impact time-series endpoint (`GET /api/v1/finops/cache-impact`) to measure how effectively your agents are leveraging semantic caching. Higher cache hit rates directly reduce LLM spend.
</Tip>
