# Ingest Data API

<Badge icon="arrow-left" color="gray">[Back to API List](/ai-for-service/apis/searchai/api-list)</Badge>

This API allows you to ingest and index data into the SearchAI application. You can ingest structured data as chunk fields, ingest an uploaded document, or perform incremental web crawling on existing web sources.

## Ingesting Documents

* To ingest content from a file, use the **Upload File API** to upload the file to the application.
* After uploading, include the `fileId` from the Upload File API response in the Ingest API to process the file content.
* Supported file formats: PDF, DOCX, PPT, and TXT. Any other file type results in an error.

## Ingesting Structured Data

* To ingest structured data, add the content to the request body using the **Chunk Fields** listed in the table below.
* **File Structure**: The JSON file must follow a specific structure:
  * The **file name** is used as the `recordTitle`.
  * The JSON file must be an **array of objects**, where each object represents a chunk.
  * Each chunk's fields must correspond to the configured chunk fields.
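Under these rules, a structured-data JSON file is an array of chunk objects. The field names below are illustrative; the actual set must match the chunk fields configured in your application. A file named `Cybersecurity.json` would be ingested with `recordTitle` set to `Cybersecurity`:

```json
[
  {
    "chunkText": "Firewalls filter inbound and outbound network traffic.",
    "chunkTitle": "Firewalls",
    "recordUrl": "https://www.example.com/firewalls"
  },
  {
    "chunkText": "Multi-factor authentication adds a second verification step.",
    "chunkTitle": "MFA",
    "recordUrl": "https://www.example.com/mfa"
  }
]
```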

## Crawling Web Pages

* This API supports incremental web crawling by adding content for an existing web source in Search AI.
  * The `sourceName` must match the Source Title for the web domain in Search AI.
  * Set `sourceType` to `"web"`.
  * Provide the URLs to crawl in the `urls` array under the `documents` field.
* The web crawl uses the crawl configuration set in Search AI for that source.
* Existing URLs are re-crawled; new URLs are crawled if the crawl configuration permits.

## API Specifications

| Field             | Value                                                |
| ----------------- | ---------------------------------------------------- |
| **Method**        | POST                                                 |
| **Endpoint**      | `https://{{host}}/api/public/bot/:botId/ingest-data` |
| **Content Type**  | `application/json`                                   |
| **Authorization** | `auth: {{JWT Token}}`                                |
| **API Scope**     | Ingest data                                          |
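As a minimal sketch, the request above can be assembled with Python's standard library. The host, bot ID, and JWT values are placeholders for illustration, not real credentials; the payload follows the chunk-ingestion shape shown later on this page:

```python
import json
import urllib.request


def build_ingest_request(host, bot_id, jwt_token, payload):
    """Assemble a POST request for the Ingest Data API.

    `host`, `bot_id`, and `jwt_token` are assumed to come from your
    environment and the application's Dev Tools settings.
    """
    url = f"https://{host}/api/public/bot/{bot_id}/ingest-data"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "auth": jwt_token,  # JWT token with the "Ingest data" scope
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
payload = {
    "sourceName": "Abc",
    "sourceType": "json",
    "documents": [{"title": "Cybersecurity", "chunks": [{"chunkText": "..."}]}],
}
req = build_ingest_request("platform.example.org", "st-12345", "<JWT>", payload)

# Uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```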

## Path Parameters

| Parameter | Required | Description                                                                                                             |
| --------- | -------- | ----------------------------------------------------------------------------------------------------------------------- |
| `host`    | Required | The environment hostname. For example, `platform.example.org`.                                                          |
| `botId`   | Required | Unique identifier of your application. To view it, go to **Dev Tools** under **App Settings** and check the API scopes. |

## Request Parameters

| Parameter    | Required | Description                                                                                                                                                                                                                                                               |
| ------------ | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `sourceName` | Yes      | Name of the source into which the content is ingested. If no source with the given name exists, a new source is created automatically. For web crawling, the name must match the Source Title of an existing web domain in Search AI.                                      |
| `sourceType` | Yes      | Accepted values: `"json"` — to upload structured chunk fields via the request object (fileId is ignored); `"file"` — to upload documents using a fileId (chunk payload is ignored); `"web"` — to crawl web pages using provided URLs.                                     |
| `documents`  | Yes      | Depending on `sourceType`: for `json`, pass structured content directly with a `title` and `chunks` array; for `web`, pass a `urls` array of pages to crawl; for `file`, pass objects with a `fileId` and optional `fileName`, `permissions`, `category`, and `priority`. |

### Sample Request — Ingesting Chunks Directly

```json
{
  "sourceName": "Abc",
  "sourceType": "json",
  "documents": [
    {
      "title": "Cybersecurity",
      "chunks": [
        {
          "chunkText": "Cybersecurity is the practice of protecting systems, networks, and programs from digital attacks.",
          "recordUrl": "https://www.example.com/cybersecurity",
          "chunkTitle": "The Importance of Cybersecurity"
        }
      ]
    }
  ]
}
```

The fields inside the `chunks` object must correspond to the configured chunk fields. To view chunk fields, refer to the **Chunk Browser**.

### Sample Request — Incremental Web Crawl

```json
{
  "sourceName": "myWebDomain",
  "sourceType": "web",
  "documents": [
    {
      "urls": [
        "https://example.com/docs/",
        "https://example.com/product-guide/",
        "https://example.com/user-guide/"
      ]
    }
  ]
}
```

If a URL is already crawled, it is re-crawled. New URLs are crawled if the source's crawl configuration permits.
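A small helper (illustrative, not part of the product API) can assemble this web-crawl body so the `sourceType` and `documents` shape stay consistent across calls:

```python
def build_web_crawl_payload(source_name, urls):
    """Build an incremental web-crawl request body.

    `source_name` must match the Source Title of an existing
    web domain in Search AI; `urls` lists the pages to crawl.
    """
    if not urls:
        raise ValueError("at least one URL is required")
    return {
        "sourceName": source_name,
        "sourceType": "web",
        "documents": [{"urls": list(urls)}],
    }
```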

### Sample Request — Ingesting Content from Files

```json
{
  "sourceName": "Abc",
  "sourceType": "file",
  "documents": [
    {
      "fileId": "f12455",
      "permissions": {
        "allowedUsers": ["john@example.com", "jane@example.com"],
        "allowedGroups": ["Engineering", "Management"]
      }
    }
  ]
}
```

Use the Upload File API to upload the file and obtain the `fileId`. Pass that `fileId` here to ingest and index the file contents.
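The ingest step of that flow can be sketched as below. The `fileId` is assumed to come from the Upload File API response; this helper (hypothetical, for illustration) only builds the request body in the shape shown above:

```python
def build_file_ingest_payload(source_name, file_id,
                              allowed_users=None, allowed_groups=None):
    """Build a file-ingestion request body from an uploaded fileId.

    `file_id` is the identifier returned by the Upload File API.
    Permissions are included only when users or groups are given.
    """
    document = {"fileId": file_id}
    if allowed_users or allowed_groups:
        document["permissions"] = {
            "allowedUsers": allowed_users or [],
            "allowedGroups": allowed_groups or [],
        }
    return {
        "sourceName": source_name,
        "sourceType": "file",
        "documents": [document],
    }
```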

