Private AI Services API

Getting Started with Private AI Services REST APIs in 5 Minutes

To get you started quickly, let's dive into the necessary steps to enable you to begin calling Private AI Services (PAIS) APIs.

Step 1 - Authorization

The PAIS API uses OpenID Connect (OIDC) for authorizing requests from clients. More precisely, the API requires an OIDC client configured for the Authorization Code flow with PKCE to be registered with the Identity Provider (IdP) that is configured for your PAIS instance.

To make requests, you will need to obtain an Access Token from the IdP server. In OIDC, an Access Token is a short-lived security credential that is sent in the Authorization header of each API request. In addition to Access Tokens, most IdP servers support Refresh Tokens for renewing the short-lived Access Tokens, allowing for long-lived API interactions. For details, consult the documentation of the IdP server registered with your PAIS instance.

Acquiring an Access Token is typically an interactive process, and most API tools (e.g., Postman) support the OAuth 2.0 standard for obtaining (and renewing) security credentials automatically.

The required OIDC configuration for interacting with the IdP server configured for your PAIS instance can be retrieved from the PAIS API. For example, for an instance hosted at pais.local, you can access the configuration in JSON format at https://pais.local/env.json.
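
For example, a minimal Python sketch for retrieving this configuration could look like the following (the exact fields contained in env.json depend on your deployment):

import httpx

# Fetch the OIDC configuration published by the PAIS instance.
# The fields contained in env.json depend on your deployment.
config = httpx.get("https://pais.local/env.json").json()
print(config)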

Automated/Non-interactive Clients

Depending on the type and configuration of your IdP, you may also be able to automate obtaining an Access Token (e.g., in non-interactive clients).

For instance, some IdP servers support the Resource Owner Password Flow grant for obtaining the required tokens, which you can automate. See example code below in Python:

import httpx
import httpx_auth  # available via `pip install httpx-auth`

oidc_auth = httpx_auth.OAuth2ResourceOwnerPasswordCredentials(
    token_url=<IdP token endpoint - see IdP configuration>,
    # The following configuration options are available via the PAIS
    # configuration at https://<fqdn>/env.json as described above
    client_id=<OIDC client-ID>,
    scope=<OIDC scopes>,
    # The OIDC credentials for which the Resource Owner Password Flow
    # grant was configured
    username=username,
    password=api_token,
)

pais_client = httpx.Client(auth=oidc_auth)
pais_client.get(...)

Note that the Resource Owner Password Flow is just one example grant. The types of grants available depend on the type of IdP and its configuration. Consult your system administrator for all available options.

Step 2 - Call your first API!

Once an Access Token has been obtained, you are ready to make your first API call. Provide the obtained token in the request headers (or use a library for automating this, as described above) and call one of the PAIS APIs.

For example, for a PAIS instance hosted at pais.local, you can retrieve the list of models available to you using this command:

curl 'https://pais.local/api/v1/compatibility/openai/v1/models' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'
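
The same request can also be issued programmatically. Below is a minimal Python sketch using httpx with the token placed directly in the request headers; the access_token value is a placeholder for the token obtained in Step 1:

import httpx

access_token = "<access-token>"  # obtained from your IdP as described in Step 1

# List the available models, sending the Access Token as a Bearer credential.
response = httpx.get(
    "https://pais.local/api/v1/compatibility/openai/v1/models",
    headers={
        "Accept": "application/json",
        "Authorization": f"Bearer {access_token}",
    },
)
print(response.json())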

About Private AI Services API Programming

The PAIS system exposes a RESTful API that can be generally divided into two groups:

  • Configuration endpoints for data sources, knowledge bases, and indexes, and
  • Endpoints for interacting with AI models (creating embeddings as well as chat and text completions) through an interface that is compatible with the OpenAI API, as well as endpoints for declaring agents.

For a detailed description of data sources, knowledge bases, indexes, agents, and their interactions, please refer to the Private AI Services documentation. Step-by-step instructions for implementing a complete workflow can be found in the sections below.
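
Because the model endpoints follow the OpenAI API, you may be able to reuse existing OpenAI-compatible client libraries by pointing them at the compatibility path. The following sketch uses the openai Python package; the base_url and the use of the Access Token as the API key are assumptions based on the endpoint layout shown in this guide, so verify them against your deployment:

from openai import OpenAI

# Assumption: the compatibility path serves as an OpenAI-style base URL and
# accepts the PAIS Access Token as the Bearer credential.
client = OpenAI(
    base_url="https://pais.local/api/v1/compatibility/openai/v1",
    api_key="<access-token>",
)

for model in client.models.list().data:
    print(model.id)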

Objects

For each type of object, the API contains endpoints for listing/retrieving, creating/updating, and deleting the object using the HTTP GET, POST, and DELETE methods, respectively. API request bodies and responses use the JavaScript Object Notation (JSON) standard.

Every object in PAIS has a unique id field, which can be used for retrieving, updating, and deleting objects. The id is generated by the API on object creation and returned in a successful response. In addition to the object id, any endpoint that returns an object also includes the type of the object in the object field and its creation time in the created_at field. Timestamps in the PAIS API are always represented as integer values following the UNIX Timestamp standard (seconds elapsed since the UNIX epoch).

Most objects in PAIS additionally contain name and description fields that help you organize your objects. Note that these are optional, and names are not enforced to be unique. Use the id field where uniqueness is required.
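
As a quick illustration of the timestamp convention, a created_at value taken from one of the responses later in this guide can be converted to a human-readable time like this:

import datetime

# created_at values are UNIX timestamps (integer seconds since the epoch).
created_at = 1743603960
print(datetime.datetime.fromtimestamp(created_at, tz=datetime.timezone.utc))
# 2025-04-02 14:26:00+00:00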

Cross-object references

When an object contains a reference to another object, the property holding the id of the referenced object is prefixed with the type of the object. For example, when retrieving details about an agent that is linked to an index, the ID of the index is accessible via index_id:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'

{
    "name": "Demo Agent",
    "id": <agent-id>,
    "object": "agent",
    "index_id": <index-id>,
    ...
}

Updating objects

To update an object in PAIS, it is sufficient to send a POST request containing only the fields you want to modify. That is, you are not required to send the entire object content to the API.

For example, to change the name as well as description of the agent, you can use the following request:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>' \
    --data '{
        "name": "my new agent name",
        "description": "an updated description of my agent"
    }'

To reset an optional value to be unset, use the JSON null value. For example, to configure the agent to no longer make use of an index, use the following request:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>' \
    --data '{
        "index_id": null
    }'

Errors

The PAIS API returns errors using standard HTTP status codes (e.g., HTTP-404 Not Found or HTTP-409 Conflict) when an API request fails. To help you identify the cause or source of the problem, each error response contains a detailed error description in the response body:

{
    "detail": [
        {
            "error_code": "ROUTE_TO_MODEL_NOT_FOUND",
            "loc": [
                "body",
                "model"
            ],
            "value": "some/model-id"
        }
    ]
}

Every error is associated with a specific error_code that identifies why the request failed; loc and value specify which input in the request (if any) caused the problem.

Note that errors related to parsing or routing the request may not contain an error_code in the response if the API cannot interpret the request at all.
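
When scripting against the API, you can combine the HTTP status code with the structured error details. A minimal sketch, reusing the agents endpoint and the <agent-id> placeholder from the examples above (production code should also handle non-JSON error bodies):

import httpx

response = httpx.get(
    "https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",
    },
)
if response.status_code >= 400:
    # Each entry in "detail" may carry an error_code, loc, and value.
    for error in response.json().get("detail", []):
        print(error.get("error_code"), error.get("loc"), error.get("value"))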

Listing and pagination

The API contains endpoints for listing and optionally filtering objects. The format of each object is the same whether you are retrieving a single object or requesting a list of objects. When listing objects, the response uses the following format:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents?limit=100' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'
{
    "object": "list",
    "has_more": false,
    "data": [
        {
            "id": <agent-1-id>,
            "name": "my new agent name",
            ...
        },
        {
            "id": <agent-2-id>,
            "name": "my other agent",
            ...
        },
        ...
    ],
    "first_id": <agent-1-id>,
    "last_id": <agent-n-id>,
    ...
}

Object listings are always capped at a certain number of rows (configurable via the limit parameter), and it is your responsibility to issue requests until all objects have been retrieved. The API indicates whether more objects are available by setting the has_more field in the response.

The order in which objects are returned depends on the order parameter, which allows returning objects ordered by their creation timestamp using asc or desc (for ascending/descending created_at time, respectively).

Depending on the type of object, listings can be filtered by specifying optional filter criteria. Each endpoint documents which filters are supported.

If you need to know the total number of objects that match your filter (if any), provide the set_num_objects=true query argument:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents?set_num_objects=true' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'
{
    "object": "list",
    "has_more": false,
    "num_objects": 2,
    "data": [
        {
            "id": <agent-1-id>,
            "name": "my new agent name",
            ...
        },
        {
            "id": <agent-2-id>,
            "name": "my other agent",
            ...
        },
        ...
    ],
    "first_id": <agent-1-id>,
    "last_id": <agent-2-id>
}

The response will contain a num_objects field to indicate the total number of available objects. The value is only set if explicitly requested, since calculating the number of total objects may incur a slowdown and is often not required by a requesting client.

For paginating across results, the API supports the limit and offset query arguments. While pagination via the offset parameter is convenient, be aware that you may skip objects if objects are deleted while you iterate.

To address this problem, the API offers an alternative form of pagination using the after and before parameters: each list response contains the id of the first and last object in the list (in first_id and last_id, respectively). These values can be used in subsequent calls to obtain the next page of objects via the before and/or after query arguments (depending on whether you are paginating in ascending or descending order).
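
A minimal pagination sketch in Python that retrieves all agents using the after parameter (assuming a token obtained as described in the Getting Started section):

import httpx

client = httpx.Client(
    base_url="https://pais.local/api/v1/compatibility/openai/v1",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",
    },
)

agents = []
after = None
while True:
    params = {"limit": 100}
    if after is not None:
        # Continue after the last object of the previous page.
        params["after"] = after
    page = client.get("/agents", params=params).json()
    agents.extend(page["data"])
    if not page["has_more"]:
        break
    after = page["last_id"]

print(f"retrieved {len(agents)} agents")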

Set up an agent: End to end

Let's implement a typical workflow you may want to follow to set up your PAIS instance.

In this example, you will

  • create a data source using Google Drive,
  • create a knowledge base and link the Google Drive data source to it,
  • create an index for the knowledge base and trigger an indexing to populate the index with data,
  • search for relevant documents in the knowledge base index,
  • create an agent and link it to the knowledge base index, and
  • interact with the agent to confirm it provides answers using context from the knowledge base.

Preparation

For this example, we assume you have already deployed

  • a PAIS instance using https://pais.local as URL,
  • an embedding model using routing name BAAI/bge-small-en-v1.5, and
  • an LLM/completion model using routing name meta-llama/Meta-Llama-3.1-8B-Instruct.

To make all commands as readable as possible, fetch an Access Token, as described in more detail above, and store it in an environment variable $token.

Models are available

To validate the models are declared as expected, list the available models:

curl 'https://pais.local/api/v1/compatibility/openai/v1/models' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "object": "list",
    "data": [
        {
            "id": "BAAI/bge-small-en-v1.5",
            "object": "model",
            "created": 1743425188,
            "owned_by": "",
            "model_type": "EMBEDDINGS",
            "model_engine": "INFINITY"
        },
        {
            "id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
            "object": "model",
            "created": 1743176301,
            "owned_by": "",
            "model_type": "COMPLETIONS",
            "model_engine": "VLLM"
        }
    ]
}

Embedding model is functional

To validate the embedding model is working, let's create embeddings for the text "hello world":

curl 'https://pais.local/api/v1/compatibility/openai/v1/embeddings' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "input": "hello world",
        "model": "BAAI/bge-small-en-v1.5"
    }'

{
    "object": "list",
    "data": [
        {
            "object": "embedding",
            "embedding": [
                0.01519611943513155,
                ...
                0.02606809325516224
            ],
            "index": 0
        }
    ],
    "model": "BAAI/bge-small-en-v1.5",
    "usage": {
        "prompt_tokens": 11,
        "total_tokens": 11
    }
}

Completion model is functional

To validate the LLM is working, let's create a chat-completion:

curl 'https://pais.local/api/v1/compatibility/openai/v1/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "stream": false,
        "max_tokens": 10,
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "messages": [
            {
                "role": "user",
                "content": "Are you there?"
            }
        ]
    }'

{
    "id": "chat-d5f3a14c26de4a8489247c20d18c7d5b",
    "object": "chat.completion",
    "created": 1743603564,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! Yes, I'm here. ...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 39,
        "completion_tokens": 10,
        "total_tokens": 49
    }
}

Workflow

With the environment fully configured, you can now create a RAG-enabled agent.

Configure a data source

First, let's ensure that the data source configuration you are about to use is valid and that the PAIS instance can reach the Google Drive APIs:

curl 'https://pais.local/api/v1/control/data-sources/test-connection' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data-raw '{
        "origin_url": "https://drive.google.com/drive/u/0/folders/...",
        "type": "GOOGLE_DRIVE",
        "credentials": "{\"type\": ...}"
    }'

{
    "status": "CONNECTIVITY_RESULT_SUCCESS",
    "detail": null
}

The connectivity check is successful, so you can use these Google Drive settings for creating your data source:

curl 'https://pais.local/api/v1/control/data-sources' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data-raw '{
        "origin_url": "https://drive.google.com/drive/u/0/folders/...",
        "name": "test-data-source",
        "description": "Example data source using data from Google Drive",
        "type": "GOOGLE_DRIVE",
        "credentials": "{\"type\": \"service_account\", ...}"
    }'

{
    "id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
    "origin_url": "https://drive.google.com/drive/u/0/folders/...",
    "name": "test-data-source",
    "description": "Example data source using data from Google Drive",
    "type": "GOOGLE_DRIVE",
    "object": "data_source",
    "created_at": 1743603960,
    "last_updated_at": 1743603960,
    "options": null
}

The data source has been created and you can use ID cd918a65-c465-4d28-973b-9f8cc6ec62b9 for referring to it moving forward.

Optionally, you can validate that the created data source is configured correctly and that the PAIS instance can reach the Google APIs:

curl 'https://pais.local/api/v1/control/data-sources/test-connection/cd918a65-c465-4d28-973b-9f8cc6ec62b9' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "status": "CONNECTIVITY_RESULT_SUCCESS",
    "detail": null
}

Like before, you can see that the data source is configured successfully.

Configure a knowledge base

First, create the knowledge base and configure it to use data sources:

curl 'https://pais.local/api/v1/control/knowledge-bases' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "index_refresh_policy": {
            "policy_type": "MANUAL"
        },
        "data_origin_type": "DATA_SOURCES",
        "name": "test-knowledge-base",
        "description": "Example knowledge base using data from Google Drive"
    }'

{
    "id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "name": "test-knowledge-base",
    "description": "Example knowledge base using data from Google Drive",
    "index_refresh_policy": {
        "policy_type": "MANUAL"
    },
    "data_origin_type": "DATA_SOURCES",
    "object": "knowledge_base",
    "created_at": 1743605493,
    "last_updated_at": 1743605493,
    "next_index_refresh_at": null
}

The knowledge base has been created and you can use ID d3a2cc76-0979-41db-b97a-ab133fd463a9 for referring to it moving forward.

Next, you link the data source to the knowledge base:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/data-sources' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "data_source_id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9"
    }'

{
    "id": "cb9667e8-1fb9-4706-898b-881d51f46594",
    "state": "NOT_INDEXED",
    "data_source": {
        "id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
        "origin_url": "https://drive.google.com/drive/u/0/folders/...",
        "name": "test-data-source",
        "description": "Example data source using data from Google Drive",
        "type": "GOOGLE_DRIVE",
        "object": "data_source",
        "created_at": 1743603960,
        "last_updated_at": 1743603960,
        "options": null
    }
}

As you can see, the data source is now linked to the knowledge base, but its state is NOT_INDEXED, as you have not indexed any files yet.

Build the index

First, create an index for your knowledge base using the embedding model BAAI/bge-small-en-v1.5 and a simple sentence splitter. Configure the index to produce text chunks of 100 tokens with no overlap:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "name": "test-index",
        "description": "Example index using data from Google Drive",
        "embeddings_model_endpoint": "BAAI/bge-small-en-v1.5",
        "text_splitting": "SENTENCE",
        "chunk_size": 100,
        "chunk_overlap": 0
    }'

{
    "id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "name": "test-index",
    "description": "Example index using data from Google Drive",
    "embeddings_model_endpoint": "BAAI/bge-small-en-v1.5",
    "text_splitting": "SENTENCE",
    "chunk_size": 100,
    "chunk_overlap": 0,
    "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "object": "index",
    "created_at": 1743606389,
    "last_indexed_at": null,
    "last_indexed_by_id": null,
    "num_documents": 0,
    "status": "AVAILABLE",
    "status_errors": null
}

The index has been created and you can use ID 32e18f23-c4fe-4e13-bcea-ec1f61808871 for referring to it moving forward. The status shows AVAILABLE, indicating that there is no error and it is ready to be used.

Since you configured your knowledge base with a manual index refresh policy, indexings are not triggered automatically, and querying for the active indexing returns an HTTP-404:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/active-indexing' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "detail": [
        {
            "error_code": "NO_ACTIVE_INDEXING",
            "loc": null,
            "msg": null,
            "type": null,
            "value": null
        }
    ]
}

You can now explicitly trigger an indexing to populate the index with data:

curl --request POST 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/indexings' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "c967401f-02f9-487b-a3ad-46833a387d49",
    "state": "PENDING",
    "created_at": 1743606878,
    "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "completed_at": null
}

Initially, the indexing will be in the PENDING state but will quickly start running, and eventually you can see that the indexing completes with state DONE:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/active-indexing' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "c967401f-02f9-487b-a3ad-46833a387d49",
    "state": "DONE",
    "created_at": 1743606878,
    "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "completed_at": 1743606885
}

Note that you could alternatively have configured an index refresh policy that automatically and periodically triggers indexings (e.g., every hour or every day).
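
If you are scripting the workflow, you can poll the active-indexing endpoint until the indexing reaches the DONE state. A minimal sketch, reusing the IDs from this example (the 5-second interval is arbitrary; production code should also handle failed indexings and the HTTP-404 returned when no indexing is active):

import time

import httpx

kb_id = "d3a2cc76-0979-41db-b97a-ab133fd463a9"
index_id = "32e18f23-c4fe-4e13-bcea-ec1f61808871"

client = httpx.Client(
    base_url="https://pais.local/api/v1/control",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",
    },
)

# Poll until the triggered indexing reports state DONE.
while True:
    indexing = client.get(
        f"/knowledge-bases/{kb_id}/indexes/{index_id}/active-indexing"
    ).json()
    if indexing.get("state") == "DONE":
        break
    time.sleep(5)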

You can now list the documents that have been added to the index:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/documents' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "object": "list",
    "has_more": false,
    "num_objects": null,
    "data": [
        {
            "origin_name": "test-file.txt",
            "origin_ref": "https://drive.google.com/file/d/...",
            "id": "ed6010b9-d51a-48c8-8a24-c9892bcac909",
            "object": "document",
            "state": "INDEXED",
            "media_type": "TXT",
            "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
            "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
            "data_source_id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
            "created_at": 1743606880,
            "last_embedded_at": 1743606882,
            "last_indexed_by_id": "c967401f-02f9-487b-a3ad-46833a387d49",
            "size_bytes": 963,
            "hash_sha256": "a17a5236fee2df65e8ecf0555a530a65d28af1b23d1439a919e1f17894e4d1c2",
            "reindexing_needed": false
        },
        ...
    ],
    "first_id": "ed6010b9-d51a-48c8-8a24-c9892bcac909",
    "last_id": "c2478660-9e5b-4930-8f0a-2af839ddb0fb"
}

You can also run queries against these documents:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/search' \
    --header 'Accept: application/json' \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "text": "what is private AI?",
        "top_k": 2,
        "similarity_cutoff": 0
    }'

{
    "chunks": [
        {
            "origin_name": "private-ai.txt",
            "origin_ref": "https://drive.google.com/file/d/...",
            "document_id": "3343786c-a232-4986-9f89-cff373766de1",
            "score": 0.4686617851257324,
            "media_type": "TXT",
            "text": "..."
        },
        ...
    ]
}

Configure the agent

Finally, create your agent using the LLM meta-llama/Meta-Llama-3.1-8B-Instruct and link it to the index created above. Configure it to use a single document chunk (the most relevant one) from the index as context for answering questions, and provide instructions to use when interacting with the LLM:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "name": "test-agent",
        "description": "Example agent using data from Google Drive",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "instructions": "You are a polite and helpful assistant.",
        "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
        "index_top_n": 1,
        "index_similarity_cutoff": 0,
        "completion_role": "assistant",
        "session_max_length": "10000",
        "session_summarization_strategy": "delete_oldest",
        "index_reference_format": "structured",
        "chat_system_instruction_mode": "system-message"
    }'

{
    "name": "test-agent",
    "description": "Example agent using data from Google Drive",
    "instructions": "You are a polite and helpful assistant.",
    "session_max_ttl": null,
    "completion_role": "assistant",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "index_top_n": 1,
    "index_similarity_cutoff": 0.0,
    "index_reference_format": "structured",
    "index_reference_delimiter": null,
    "session_max_length": 10000,
    "session_summarization_strategy": "delete_oldest",
    "metadata": {},
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent",
    "created_at": 1743607461,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "status": "AVAILABLE",
    "status_errors": null,
    "chat_system_instruction_mode": "system-message"
}

The agent has been created and you can use ID e52e4764-bdc4-4b16-a906-3c977d746d74 for referring to it moving forward. The status shows AVAILABLE, indicating that there is no error and the agent is ready to be used. You can now send it a chat-completion request:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'

{
    "id": "chat-882f92ca78b04ea7ae239aa6ea8751e2",
    "object": "chat.completion",
    "created": 1743607788,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "It appears to be a collection of keywords...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 159,
        "completion_tokens": 100,
        "total_tokens": 259
    },
    "session_id": null,
    "index_context_info": [
        {
            "text": "...",
            "metadata": {
                "document_id": "3343786c-a232-4986-9f89-cff373766de1",
                "score": 0.4686617851257324,
                "origin_name": "private-ai.txt",
                "media_type": "TXT",
                "origin_ref": "https://drive.google.com/file/d/..."
            }
        }
    ]
}

The response from the agent contains the actual LLM response and includes metadata about the document chunks that were retrieved from the index as context for the LLM.

How much context is retrieved (if any) and how it is returned to you are configurable as part of the agent. For example, you can update the agent to no longer return the context:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "index_reference_format": null
    }'

{
    "name": "test-agent",
    "description": "Example agent using data from Google Drive",
    "instructions": "You are a polite and helpful assistant.",
    "session_max_ttl": null,
    "completion_role": "assistant",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "index_top_n": 1,
    "index_similarity_cutoff": 0.0,
    "index_reference_format": null,
    "index_reference_delimiter": null,
    "session_max_length": 10000,
    "session_summarization_strategy": "delete_oldest",
    "metadata": {},
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent",
    "created_at": 1743607461,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "status": "AVAILABLE",
    "status_errors": null,
    "chat_system_instruction_mode": "system-message"
}

Moving forward, responses from the agent will no longer contain the context metadata:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'

{
    "session_id": null,
    "index_context_info": null,
    "id": "chat-23bc6fdd15724eeabe438de8f3b38c63",
    "object": "chat.completion",
    "created": 1743609282,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "It appears to be a collection of keywords...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 159,
        "completion_tokens": 100,
        "total_tokens": 259
    }
}

Lastly, you can update the agent once more so that it no longer uses any information from the knowledge base. For the sake of the demo, you can also update the agent instructions to tell it to answer in a different language:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "index_id": null,
        "instructions": "You are a polite and helpful assistant who always answers in German."
    }'

{
    "name": "test-agent",
    "description": "Example agent using data from Google Drive",
    "instructions": "You are a polite and helpful assistant who always answers in German.",
    "session_max_ttl": null,
    "completion_role": "assistant",
    "index_id": null,
    "index_top_n": null,
    "index_similarity_cutoff": null,
    "index_reference_format": null,
    "index_reference_delimiter": null,
    "session_max_length": 10000,
    "session_summarization_strategy": "delete_oldest",
    "metadata": {},
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent",
    "created_at": 1743607461,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "status": "AVAILABLE",
    "status_errors": null,
    "chat_system_instruction_mode": "system-message"
}

When you interact with the agent moving forward, it will no longer have any context from the data indexed from Google Drive, and it will answer in German:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'
{
    "session_id": null,
    "index_context_info": null,
    "id": "chat-66fce20014e84b638341e95f40a48411",
    "object": "chat.completion",
    "created": 1743609492,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Die private AI, auch bekannt als künstliche Intelligenz ...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 53,
        "completion_tokens": 100,
        "total_tokens": 153
    }
}

Clean-up

Finally, you can clean up the test objects you created in this workflow.

Delete the agent:

curl --request DELETE 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent.deleted",
    "deleted": true
}

Delete the knowledge base and index:

curl --request DELETE 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "object": "knowledge_base.deleted",
    "deleted": true
}

Delete the data source:

curl --request DELETE 'https://pais.local/api/v1/control/data-sources/cd918a65-c465-4d28-973b-9f8cc6ec62b9' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
    "object": "data_source.deleted",
    "deleted": true
}

You can validate that everything has been deleted, for example by listing the available knowledge bases:

curl 'https://pais.local/api/v1/control/knowledge-bases?set_num_objects=true' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "object": "list",
    "has_more": false,
    "num_objects": 0,
    "data": [],
    "first_id": null,
    "last_id": null
}

And any interactions with the deleted agent will result in an error:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'