Real-Time Streaming Chat with AI Agents

Agent Manager streams responses using Server-Sent Events (SSE), so your application can render the agent’s answer incrementally as tokens arrive rather than waiting for the full response. The same stream also exposes the agent’s internal reasoning and tool activity, giving you full visibility into how the agent is working.

How streaming works

When you call the stream endpoint, the server holds the connection open and pushes a sequence of AgentStreamEvent objects as newline-delimited SSE messages. Each event has a discriminator field (event) that tells you what kind of data it carries. To open a stream, send a request like this:

POST /api/agents/{agentId}/runs/stream
Content-Type: application/json
Accept: text/event-stream

The request body is identical to that of a synchronous run. The only differences are the endpoint path and the Accept header.

Event structure

Every SSE message contains a single AgentStreamEvent:

event (string, required)

The event type. See the table below for all possible values.

data (string)

The payload for this event. For delta events this is a text fragment; for tool events it is a JSON string.

timestamp (number)

Unix timestamp (milliseconds) when the event was emitted by the server.
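Because each SSE message arrives as serialized JSON, it can help to validate the parsed value before acting on it. The sketch below is one way to do that, not part of Agent Manager itself: the helper names isAgentStreamEvent and eventTime are hypothetical, and data and timestamp are treated as optional on the assumption that only event is marked required above.

```typescript
// Event names taken from the "Event types" table in this document.
const EVENT_TYPES = [
  'START', 'REASONING_DELTA', 'CONTENT_DELTA',
  'TOOL_START', 'TOOL_END', 'STOP', 'ERROR',
] as const;

type AgentEventType = (typeof EVENT_TYPES)[number];

interface ParsedAgentStreamEvent {
  event: AgentEventType;
  data?: string;      // assumption: optional, since only `event` is marked required
  timestamp?: number; // Unix milliseconds
}

// Hypothetical helper: narrows an unknown JSON value to ParsedAgentStreamEvent.
function isAgentStreamEvent(value: unknown): value is ParsedAgentStreamEvent {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  const event = record['event'];
  const data = record['data'];
  const timestamp = record['timestamp'];

  const knownType =
    typeof event === 'string' &&
    (EVENT_TYPES as readonly string[]).indexOf(event) !== -1;
  const dataOk = data === undefined || typeof data === 'string';
  const timestampOk = timestamp === undefined || typeof timestamp === 'number';
  return knownType && dataOk && timestampOk;
}

// Hypothetical helper: converts the millisecond timestamp to a Date, if present.
function eventTime(evt: ParsedAgentStreamEvent): Date | undefined {
  return evt.timestamp === undefined ? undefined : new Date(evt.timestamp);
}
```

A caller would typically run JSON.parse on the payload, pass the result through isAgentStreamEvent, and skip or log anything that fails the check rather than letting a malformed message break the stream loop.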

Event types

Event	Description
`START`	The stream has been initialized. No user-visible data.
`REASONING_DELTA`	A fragment of the agent’s inner reasoning — what it is thinking before calling a tool.
`CONTENT_DELTA`	A fragment of the final answer text. Concatenate these to build the full response.
`TOOL_START`	The agent is about to call a tool. `data` contains the tool name and input as JSON.
`TOOL_END`	A tool call has completed. `data` contains the tool output as JSON.
`STOP`	The stream is complete. No more events will follow.
`ERROR`	An error occurred. `data` contains an error message.

REASONING_DELTA events expose the agent’s “inner thoughts” — the chain-of-thought reasoning it produces before deciding which tool to call. You can render these separately (for example, in a collapsible “Thinking…” section) to show users how the agent reached its conclusion.
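To illustrate how the event types fit together, here is a minimal sketch of a reducer that folds a sequence of events into a single view object a UI could render. RunView, emptyRunView, and applyEvent are hypothetical names introduced for this example, and the tool payloads are kept as unknown because their exact JSON shape is not specified here.

```typescript
type AgentEventType =
  | 'START' | 'REASONING_DELTA' | 'CONTENT_DELTA'
  | 'TOOL_START' | 'TOOL_END' | 'STOP' | 'ERROR';

interface AgentStreamEvent {
  event: AgentEventType;
  data: string;
  timestamp: number;
}

// Hypothetical view model: everything a chat UI needs to render one run.
interface RunView {
  reasoning: string; // concatenated REASONING_DELTA fragments
  content: string;   // concatenated CONTENT_DELTA fragments
  toolEvents: { phase: 'start' | 'end'; payload: unknown }[];
  done: boolean;
  error?: string;
}

function emptyRunView(): RunView {
  return { reasoning: '', content: '', toolEvents: [], done: false };
}

// Returns a new view with one event applied; the input view is not mutated.
function applyEvent(view: RunView, evt: AgentStreamEvent): RunView {
  switch (evt.event) {
    case 'REASONING_DELTA':
      return { ...view, reasoning: view.reasoning + evt.data };
    case 'CONTENT_DELTA':
      return { ...view, content: view.content + evt.data };
    case 'TOOL_START':
    case 'TOOL_END': {
      const phase: 'start' | 'end' = evt.event === 'TOOL_START' ? 'start' : 'end';
      // Tool events carry a JSON string; JSON.parse throws if it is malformed.
      const payload: unknown = JSON.parse(evt.data);
      return { ...view, toolEvents: [...view.toolEvents, { phase, payload }] };
    }
    case 'STOP':
      return { ...view, done: true };
    case 'ERROR':
      return { ...view, done: true, error: evt.data };
    default:
      return view; // START carries no user-visible data
  }
}
```

Keeping reasoning and content in separate fields makes it straightforward to render the former in a collapsible panel and the latter as the main answer.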

Consuming the stream in TypeScript

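The example below uses fetch with a ReadableStream reader to consume the stream and separate reasoning from content. The browser's native EventSource API only issues GET requests, so it cannot send the POST body this endpoint expects unless you use a polyfill that supports POST with a body: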

interface AgentStreamEvent {
  event: 'START' | 'REASONING_DELTA' | 'CONTENT_DELTA' | 'TOOL_START' | 'TOOL_END' | 'STOP' | 'ERROR';
  data: string;
  timestamp: number;
}

async function streamAgentResponse(
  agentId: string,
  message: string,
  sessionId: string,
  onReasoning: (chunk: string) => void,
  onContent: (chunk: string) => void,
  onDone: () => void
): Promise<void> {
  const response = await fetch(`/api/agents/${agentId}/runs/stream`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Accept: 'text/event-stream',
    },
    body: JSON.stringify({ message, sessionId }),
  });

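  // Fail fast on HTTP errors before attempting to read the stream.
  if (!response.ok) throw new Error(`Stream request failed with status ${response.status}`);
  if (!response.body) throw new Error('No response body');

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode the chunk and hold back any trailing partial line until the next read.
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? '';

    for (const line of lines) {
      // Only "data:" lines carry a serialized AgentStreamEvent.
      if (!line.startsWith('data:')) continue;
      const payload = line.slice('data:'.length).trim();
      if (!payload) continue;

      const evt: AgentStreamEvent = JSON.parse(payload);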

      switch (evt.event) {
        case 'REASONING_DELTA':
          onReasoning(evt.data);
          break;
        case 'CONTENT_DELTA':
          onContent(evt.data);
          break;
        case 'STOP':
          onDone();
          break;
        case 'ERROR':
          throw new Error(evt.data);
      }
    }
  }
}
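The buffering in the loop above matters because network chunks do not respect message boundaries: a single JSON payload can be split across two reads. The sketch below isolates that logic in a small, hypothetical helper (createDataLineParser is not part of any API) so the behavior can be checked without a network connection.

```typescript
// Hypothetical helper: returns a function that accepts decoded text chunks and
// returns the payload of every complete "data:" line received so far.
function createDataLineParser(): (chunk: string) => string[] {
  let buffer = '';
  return (chunk: string): string[] => {
    buffer += chunk;
    const lines = buffer.split('\n');
    // The last element is either an incomplete line or '' and is kept for later.
    buffer = lines.pop() || '';

    const payloads: string[] = [];
    for (const line of lines) {
      if (line.indexOf('data:') !== 0) continue; // same check as startsWith('data:')
      const payload = line.slice('data:'.length).trim();
      if (payload) payloads.push(payload);
    }
    return payloads;
  };
}

// A message split in the middle of its JSON payload across two chunks.
const feed = createDataLineParser();
const first = feed('data: {"event":"CONTENT_DELTA","da');
const second = feed('ta":"Hel"}\n\ndata: {"event":"STOP","data":""}\n\n');

console.log(first.length);  // 0: the first chunk ends mid-line, so nothing is emitted yet
console.log(second.length); // 2: both messages are complete after the second chunk
```

Parsing each chunk independently instead would call JSON.parse on a truncated string and throw, which is why the partial line is carried over between reads.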

The built-in UI at http://localhost:5173 already implements this pattern. It renders reasoning steps in a collapsible panel, streams final answer tokens with Markdown formatting, and shows human-in-the-loop (HITL) approval controls when a run reaches PAUSED status.

Multimodal input

For agents that support vision, you can include image attachments in the request body alongside your message. Images must be base64-encoded or referenced by a publicly accessible URL.

Request body with media

{
  "message": "What does this chart show?",
  "sessionId": "session-abc-123",
  "media": [
    {
      "type": "image/png",
      "data": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ]
}

media[].type (string, required)

The MIME type of the attachment, such as image/png or image/jpeg.

media[].data (string, required)

Base64-encoded binary content of the file, or an HTTP/HTTPS URL pointing to the image.

You can include multiple media items in a single request:

{
  "message": "Compare these two screenshots",
  "media": [
    { "type": "image/png", "data": "<base64_image_1>" },
    { "type": "image/png", "data": "<base64_image_2>" }
  ]
}
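Before sending media, a client can run a quick sanity check on each item so obvious mistakes fail locally rather than on the server. The following is a sketch under stated assumptions: classifyMediaData and buildRunRequestBody are hypothetical helpers, sessionId is treated as optional because the multi-image example above omits it, and the image/ prefix check reflects only the MIME types mentioned in this section. The server's actual validation rules may differ.

```typescript
interface MediaItem {
  type: string; // MIME type, such as image/png
  data: string; // base64 content or an HTTP/HTTPS URL
}

interface RunRequestBody {
  message: string;
  sessionId?: string; // assumption: optional
  media?: MediaItem[];
}

// Hypothetical helper: guesses which of the two documented forms `data` uses.
function classifyMediaData(data: string): 'url' | 'base64' | 'invalid' {
  if (/^https?:\/\/\S+$/i.test(data)) return 'url';
  // Standard base64 alphabet with optional padding; length is a multiple of 4.
  const looksBase64 =
    data.length > 0 &&
    data.length % 4 === 0 &&
    /^[A-Za-z0-9+\/]+={0,2}$/.test(data);
  return looksBase64 ? 'base64' : 'invalid';
}

// Hypothetical helper: validates media items and assembles the request body.
function buildRunRequestBody(
  message: string,
  media: MediaItem[],
  sessionId?: string
): RunRequestBody {
  media.forEach((item, i) => {
    if (item.type.indexOf('image/') !== 0) {
      throw new Error(`media[${i}].type is not an image MIME type: ${item.type}`);
    }
    if (classifyMediaData(item.data) === 'invalid') {
      throw new Error(`media[${i}].data is neither base64 nor an HTTP/HTTPS URL`);
    }
  });

  const body: RunRequestBody = { message, media };
  if (sessionId !== undefined) body.sessionId = sessionId;
  return body;
}
```

The resulting object can be passed to JSON.stringify and used as the body of either a synchronous or a streaming run request.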

Not all agents are configured with vision-capable models. Sending media to a text-only agent will result in an error. Check the agent’s configuration or use procurator_assistant to ask which agents support multimodal input.

Full streaming example

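The following end-to-end example opens a stream, collects reasoning and content separately, and prints them to the console. It writes to process.stdout, so it assumes a Node.js runtime; Node's fetch does not resolve relative URLs, so in that environment the path inside streamAgentResponse needs an absolute base URL: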

let reasoning = '';
let answer = '';

await streamAgentResponse(
  'finance_agent',
  'Summarize the last 5 days of TSLA trading activity.',
  crypto.randomUUID(),
  (chunk) => {
    reasoning += chunk;
    process.stdout.write(`[thinking] ${chunk}`);
  },
  (chunk) => {
    answer += chunk;
    process.stdout.write(chunk);
  },
  () => {
    console.log('\n\n--- Stream complete ---');
    console.log('Full reasoning:', reasoning);
    console.log('Full answer:', answer);
  }
);

Built-in chat UI

The Agent Manager UI (http://localhost:5173) provides a production-ready chat interface with:

Live streaming

Tokens render incrementally as they arrive. Reasoning steps appear in a collapsible “Thinking…” panel above the final answer.

Markdown rendering

Responses are rendered with full Markdown support: tables, code blocks with syntax highlighting, and lists.

Multimodal input

Drag and drop image files directly into the chat input. The UI handles base64 encoding automatically.

HITL controls

When a run is paused for approval, the UI displays a visual indicator with “Approve” and “Reject” buttons.
