# Building a Knowledge Base for RAG-Powered Agents

> Upload PDF and text documents into vector storage so your agents can retrieve accurate, domain-specific information on demand using Active RAG.

A knowledge base is a collection of documents ingested into vector storage that your agents search at runtime. When an agent needs domain-specific information, it calls the built-in `search_knowledge_base` tool, retrieves the most relevant passages from your documents, and uses that context to generate accurate answers. This approach—where the agent decides when to look something up—is called **Active RAG**.

## How agents use knowledge

When you attach a knowledge base to an agent, the agent is given access to the `search_knowledge_base(query)` tool. The agent autonomously decides when retrieval is necessary: it issues a query, receives semantically matched document chunks, and incorporates the results into its response. You do not need to configure any injection logic—retrieval happens automatically as part of the agent's reasoning loop.

<Info>
  Active RAG gives agents the judgment to look up information only when relevant, rather than blindly prepending document context to every prompt.
</Info>
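
To make the "agent decides" idea concrete, here is a minimal local sketch of the control flow. Everything in it is hypothetical: the in-memory corpus, the keyword-based `needs_retrieval` rule, and this local `search_knowledge_base` stand-in are illustrations only. The real tool runs a semantic search against vector storage, and the model itself makes the retrieval decision.

```python
from typing import Callable, Dict, List

# Toy in-memory corpus standing in for vector storage (illustrative data only).
_CORPUS = {
    "refund": "Refund requests are handled by the billing team.",
    "onboarding": "New hires receive the onboarding guide on day one.",
}


def search_knowledge_base(query: str) -> List[str]:
    """Hypothetical local stand-in for the built-in tool: keyword match, not vector search."""
    lowered = query.lower()
    return [text for keyword, text in _CORPUS.items() if keyword in lowered]


def needs_retrieval(question: str) -> bool:
    """Toy decision rule; in Active RAG the model itself makes this call."""
    lowered = question.lower()
    return any(keyword in lowered for keyword in _CORPUS)


def answer(question: str,
           tool: Callable[[str], List[str]] = search_knowledge_base) -> Dict[str, object]:
    """Call the tool only when the question appears to need domain knowledge."""
    if not needs_retrieval(question):
        # No lookup: nothing is prepended to the prompt.
        return {"used_tool": False, "context": []}
    return {"used_tool": True, "context": tool(question)}
```

A greeting such as `answer("Hello there")` skips the tool entirely, while `answer("What is the refund policy?")` triggers one lookup. That selectivity is the difference from prepending document context to every prompt.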

## Uploading documents

Use `POST /api/knowledge/upload` with a `multipart/form-data` body. The `file` part is required; `knowledgeBaseId` and `description` are optional. Supported formats are **PDF** and **TXT**.

The endpoint returns immediately with a `PROCESSING` status and a `documentId`. Ingestion—text extraction, chunking, and vector embedding—runs asynchronously in the background.

<CodeGroup>
  ```bash Upload a PDF theme={null}
  curl -X POST http://localhost:8080/api/knowledge/upload \
    -H "Authorization: Bearer {token}" \
    -F "file=@company-policy.pdf"
  ```

  ```bash Upload a TXT file theme={null}
  curl -X POST http://localhost:8080/api/knowledge/upload \
    -H "Authorization: Bearer {token}" \
    -F "file=@onboarding-guide.txt"
  ```
</CodeGroup>

**Response**

```json theme={null}
{
  "documentId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "PROCESSING",
  "message": "Upload processing started"
}
```

<ParamField body="file" type="file" required>
  The document to upload. Must be a non-empty PDF or TXT file.
</ParamField>

<ParamField body="knowledgeBaseId" type="string">
  UUID of the knowledge base to associate this document with. Omit to add to the default knowledge base.
</ParamField>

<ParamField body="description" type="string">
  Optional human-readable description stored alongside the document metadata.
</ParamField>
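
If you are calling the endpoint from code rather than curl, the request can be assembled with the Python standard library alone. The following is a rough sketch, not an official client: the helper names `build_multipart` and `upload_request` are made up for illustration, and the base URL and bearer token are assumed to match the curl examples above.

```python
import urllib.request
import uuid
from typing import Dict, Optional, Tuple


def build_multipart(file_name: str, file_bytes: bytes, content_type: str,
                    fields: Optional[Dict[str, str]] = None) -> Tuple[bytes, str]:
    """Encode optional text fields plus one `file` part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in (fields or {}).items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8"))
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode("utf-8"))
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_request(base_url: str, file_name: str, file_bytes: bytes, content_type: str,
                   token: Optional[str] = None,
                   fields: Optional[Dict[str, str]] = None) -> urllib.request.Request:
    """Build (but do not send) the upload request; rejects empty files up front."""
    if not file_bytes:
        raise ValueError("file must be non-empty")
    body, content_type_header = build_multipart(file_name, file_bytes, content_type, fields)
    request = urllib.request.Request(
        f"{base_url}/api/knowledge/upload", data=body, method="POST")
    request.add_header("Content-Type", content_type_header)
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    return request
```

Pass the result to `urllib.request.urlopen` to send it, then read `documentId` from the JSON response. Optional parameters such as `description` go in `fields`.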

## Loading URLs

To ingest web pages for a specific agent, call `POST /api/agents/{agentId}/knowledge/load`. Agent Manager scrapes the URLs configured for that agent and runs the content through the same ingestion pipeline as file uploads.

```bash theme={null}
curl -X POST http://localhost:8080/api/agents/{agentId}/knowledge/load \
  -H "Authorization: Bearer {token}"
```

**Response**

```json theme={null}
{
  "jobId": "job-f9a44e"
}
```

<Tip>
  You can also trigger URL ingestion directly from the **Agent detail view** in the UI without writing any code.
</Tip>
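
The same call can be scripted. This sketch assumes the endpoint path and `jobId` response field shown above; the function names `load_knowledge_request` and `parse_job_id` are illustrative, not part of any SDK.

```python
import json
import urllib.request
from urllib.parse import quote


def load_knowledge_request(base_url: str, agent_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that triggers URL ingestion for one agent."""
    # Percent-encode the agent ID so unexpected characters cannot alter the path.
    url = f"{base_url}/api/agents/{quote(agent_id, safe='')}/knowledge/load"
    request = urllib.request.Request(url, data=b"", method="POST")
    request.add_header("Authorization", f"Bearer {token}")
    return request


def parse_job_id(response_body: str) -> str:
    """Extract `jobId` from the JSON response body."""
    return json.loads(response_body)["jobId"]
```

Send the request with `urllib.request.urlopen` and pass the decoded body to `parse_job_id` to keep a reference to the ingestion job.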

## Listing documents

`GET /api/knowledge` returns a paginated list of all documents, including those still being ingested. Use each document's `status` field to track ingestion progress.

```bash theme={null}
curl http://localhost:8080/api/knowledge
```

**Response**

```json theme={null}
{
  "content": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "name": "company-policy.pdf",
      "contentType": "application/pdf",
      "status": "COMPLETED",
      "createdAt": "2026-05-06T10:00:00Z"
    },
    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "name": "https://docs.example.com/api-reference",
      "contentType": "text/html",
      "status": "PROCESSING",
      "createdAt": "2026-05-06T10:05:00Z"
    }
  ],
  "totalElements": 2,
  "totalPages": 1
}
```

### Document status values

<CardGroup cols={3}>
  <Card title="PROCESSING" icon="loader">
    The document has been received and is being extracted, chunked, and embedded. It is not yet searchable.
  </Card>

  <Card title="COMPLETED" icon="circle-check">
    Ingestion succeeded. The document is fully indexed and available for semantic search.
  </Card>

  <Card title="FAILED" icon="circle-x">
    Ingestion encountered an error. Check `statusMessage` for details. URL-sourced documents can be retried; file uploads must be re-uploaded.
  </Card>
</CardGroup>
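
One way to act on these statuses in a script is to map each document in the listing to a next step. The sketch below makes one assumption beyond what is documented: it treats a document as URL-sourced when its `name` starts with `http://` or `https://`, as in the sample response. The action labels are arbitrary strings chosen for illustration.

```python
from typing import Dict


def next_action(document: Dict[str, str]) -> str:
    """Translate a document's status into a suggested next step."""
    status = document["status"]
    if status == "PROCESSING":
        return "wait"          # not searchable yet
    if status == "COMPLETED":
        return "ready"         # indexed and available for search
    if status == "FAILED":
        # Assumption: URL-sourced documents carry their URL in `name`.
        from_url = document.get("name", "").startswith(("http://", "https://"))
        return "retry-load" if from_url else "re-upload"
    raise ValueError(f"unknown status: {status}")


def plan(listing: Dict[str, object]) -> Dict[str, str]:
    """Map every document ID on one page of `GET /api/knowledge` to its next step."""
    return {doc["id"]: next_action(doc) for doc in listing["content"]}
```

Applied to the sample listing above, `plan` marks `company-policy.pdf` as `ready` and the still-processing URL document as `wait`.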

## Testing semantic search

Before connecting a knowledge base to an agent, you can test retrieval directly with `GET /api/knowledge/search`.

```bash theme={null}
curl "http://localhost:8080/api/knowledge/search?query=refund+policy+for+enterprise+customers"
```

The response is a list of `Document` objects ranked by semantic similarity to your query. Use this to verify that your documents are indexed correctly and that relevant content surfaces for representative queries.
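
When test queries come from user input or a file, build the URL programmatically so spaces and reserved characters are encoded correctly. This is a small sketch using the standard library; `search_url` is an illustrative helper name, and only the `query` parameter shown above is assumed.

```python
from urllib.parse import urlencode


def search_url(base_url: str, query: str) -> str:
    """Build the search URL with the query string safely percent-encoded."""
    if not query.strip():
        raise ValueError("query must not be blank")
    # urlencode turns spaces into '+' and escapes characters such as '&' and '?'.
    return f"{base_url}/api/knowledge/search?{urlencode({'query': query})}"
```

For example, `search_url("http://localhost:8080", "refund policy for enterprise customers")` yields the same URL as the curl command above.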

## Deleting a document

`DELETE /api/knowledge/{id}` performs a cascading delete: it permanently removes both the document metadata and all of its vector chunks from storage.

```bash theme={null}
curl -X DELETE http://localhost:8080/api/knowledge/a1b2c3d4-e5f6-7890-abcd-ef1234567890
```

Returns `204 No Content` on success.

<Warning>
  Deletion is irreversible. The document must be re-uploaded or re-ingested from its URL to restore it.
</Warning>
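
Because deletion cannot be undone, a script may want to validate the ID before issuing the request. The sketch below assumes document IDs are UUIDs, as the sample responses suggest; `delete_request` and `is_deleted` are illustrative names.

```python
import urllib.request
import uuid


def delete_request(base_url: str, document_id: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request after validating the ID."""
    # Assumption: IDs are UUIDs. uuid.UUID raises ValueError on malformed input.
    canonical_id = str(uuid.UUID(document_id))
    return urllib.request.Request(
        f"{base_url}/api/knowledge/{canonical_id}", method="DELETE")


def is_deleted(status_code: int) -> bool:
    """The endpoint signals success with 204 No Content."""
    return status_code == 204
```

Send the request with `urllib.request.urlopen` and check the response's `status` attribute with `is_deleted`.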

## End-to-end example

<Steps>
  <Step title="Upload your document">
    Submit a PDF or TXT file via `POST /api/knowledge/upload` and save the returned `documentId`.
  </Step>

  <Step title="Wait for ingestion to complete">
    Poll `GET /api/knowledge` and check the document's `status` field. It will transition from `PROCESSING` to `COMPLETED` (or `FAILED`) as ingestion runs.
  </Step>

  <Step title="Test retrieval">
    Run a test query with `GET /api/knowledge/search?query=...` to confirm relevant content is returned.
  </Step>

  <Step title="Run your agent">
    Submit a request to your agent. When the agent determines that retrieved context would help answer the query, it automatically calls `search_knowledge_base` and incorporates the results.
  </Step>
</Steps>
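
The polling in step 2 can be sketched as a small loop. This is a minimal example under stated assumptions: `fetch_listing` is a caller-supplied function that returns the parsed JSON from `GET /api/knowledge`, and the interval and attempt limit are arbitrary defaults. Because the listing is paginated, a real implementation may need to look beyond the first page.

```python
import time
from typing import Callable, Dict


def wait_for_ingestion(document_id: str,
                       fetch_listing: Callable[[], Dict],
                       interval: float = 2.0,
                       max_attempts: int = 30,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll the listing until the document reaches COMPLETED or FAILED."""
    for attempt in range(max_attempts):
        listing = fetch_listing()
        document = next(
            (doc for doc in listing.get("content", []) if doc.get("id") == document_id),
            None)
        if document is not None and document.get("status") in ("COMPLETED", "FAILED"):
            return document["status"]
        if attempt < max_attempts - 1:
            sleep(interval)  # still PROCESSING (or not listed yet): wait and retry
    raise TimeoutError(f"document {document_id} did not finish ingesting")
```

Injecting `fetch_listing` and `sleep` keeps the loop independent of any HTTP client and makes it easy to exercise with canned responses.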
