Private AI Services API

Getting Started with Private AI Services REST APIs in 5 Minutes

To get you started quickly, let's dive into the necessary steps to enable you to begin calling Private AI Services (PAIS) APIs.

Step 1 - Authorization

The PAIS API uses OpenID Connect (OIDC) for authorizing requests from clients. More precisely, the API requires an OIDC client configured for the Authorization Code flow with PKCE to be registered with the Identity Provider (IdP) that is configured for your PAIS instance.

To make requests, you will need to obtain an Access Token from the IdP server. In OIDC, an Access Token is a short-lived security credential that is sent in the Authorization header of each API request. In addition to Access Tokens, most IdP servers support Refresh Tokens for renewing the short-lived Access Tokens, allowing for long-lived API interactions. For details, consult the documentation of the IdP server registered with your PAIS instance.

Acquiring an Access Token is typically an interactive process, and most API tools (e.g., Postman) support the OAuth 2.0 standard for obtaining (and renewing) security credentials automatically.

The required OIDC configuration for interacting with the IdP server configured for your PAIS instance can be retrieved from the PAIS API. For example, for an instance hosted at pais.local, you can access the configuration in JSON format at https://pais.local/env.json.
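
For example, a minimal Python sketch for retrieving this configuration could look like the following (the exact fields contained in env.json depend on your deployment):

import httpx

# Fetch the OIDC configuration published by the PAIS instance.
# The fields contained in env.json depend on your deployment.
config = httpx.get("https://pais.local/env.json").json()
print(config)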

Automated/Non-interactive Clients

Depending on the type and configuration of your IdP, you may also be able to automate obtaining an Access Token (e.g., in non-interactive clients).

For instance, some IdP servers support the Resource Owner Password Flow grant for obtaining the required tokens, which you can automate. See example code below in Python:

import httpx
import httpx_auth  # available via `pip install httpx-auth`

oidc_auth = httpx_auth.OAuth2ResourceOwnerPasswordCredentials(
    token_url=<IdP token endpoint - see IdP configuration>,
    # The following configuration options are available via the PAIS
    # configuration at https://<fqdn>/env.json as described above
    client_id=<OIDC client-ID>,
    scope=<OIDC scopes>,
    # The OIDC credentials for which the Resource Owner Password Flow
    # grant was configured
    username=username,
    password=api_token,
)

pais_client = httpx.Client(auth=oidc_auth)
pais_client.get(...)

Note that the Resource Owner Password Flow is just one example grant. The types of grants available depend on the type of IdP and its configuration. Consult your system administrator for all available options.

Step 2 - Call your first API!

Once an Access Token has been obtained, you are ready to make your first API call. Provide the obtained token in the request headers (or use a library for automating this, as described above) and call one of the PAIS APIs.

For example, for a PAIS instance hosted at pais.local, you can retrieve the list of models available to you using this command:

curl 'https://pais.local/api/v1/compatibility/openai/v1/models' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'
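
The same request can also be issued programmatically. Below is a minimal Python sketch using httpx with the token placed directly in the request headers; the access_token value is a placeholder for the token obtained in Step 1:

import httpx

access_token = "<access-token>"  # obtained from your IdP as described in Step 1

# List the available models, sending the Access Token as a Bearer credential.
response = httpx.get(
    "https://pais.local/api/v1/compatibility/openai/v1/models",
    headers={
        "Accept": "application/json",
        "Authorization": f"Bearer {access_token}",
    },
)
print(response.json())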

About Private AI Services API Programming

The PAIS system exposes a RESTful API that can be generally divided into two groups:

  • Configuration endpoints for data sources, knowledge bases, and indexes, and
  • Endpoints for interacting with AI models (creating embeddings as well as chat and text completions) through an interface that is compatible with the OpenAI API, as well as endpoints for declaring agents.

For a detailed description of data sources, knowledge bases, indexes, agents, and their interactions, please refer to the Private AI Services documentation. Step-by-step instructions for implementing a complete workflow can be found in the sections below.
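
Because the model endpoints follow the OpenAI API, you may be able to reuse existing OpenAI-compatible client libraries by pointing them at the compatibility path. The following sketch uses the openai Python package; the base_url and the use of the Access Token as the API key are assumptions based on the endpoint layout shown in this guide, so verify them against your deployment:

from openai import OpenAI

# Assumption: the compatibility path serves as an OpenAI-style base URL and
# accepts the PAIS Access Token as the Bearer credential.
client = OpenAI(
    base_url="https://pais.local/api/v1/compatibility/openai/v1",
    api_key="<access-token>",
)

for model in client.models.list().data:
    print(model.id)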

Objects

For each type of object, the API contains endpoints for listing/retrieving, creating/updating, and deleting the object using the HTTP GET, POST, and DELETE methods, respectively. API request bodies and responses use the JavaScript Object Notation (JSON) standard.

Every object in PAIS has a unique id field, which can be used for retrieving, updating, and deleting objects. The id is generated by the API on object creation and returned in a successful response. In addition to the object id, any endpoint that returns an object also includes the type of the object in the object field and its creation time in the created_at field. Timestamps in the PAIS API are always represented as integer values following the UNIX Timestamp standard (seconds elapsed since the UNIX epoch).

Most objects in PAIS additionally contain name and description fields that help you organize your objects. Note that these are optional, and names are not enforced to be unique. Use the id field where uniqueness is required.
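
As a quick illustration of the timestamp convention, a created_at value taken from one of the responses later in this guide can be converted to a human-readable time like this:

import datetime

# created_at values are UNIX timestamps (integer seconds since the epoch).
created_at = 1743603960
print(datetime.datetime.fromtimestamp(created_at, tz=datetime.timezone.utc))
# 2025-04-02 14:26:00+00:00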

Cross-object references

When an object contains a reference to another object, the property holding the id of the referenced object is prefixed with the type of the object. For example, when retrieving details about an agent that is linked to an index, the ID of the index is accessible via index_id:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'

{
    "name": "Demo Agent",
    "id": <agent-id>,
    "object": "agent",
    "index_id": <index-id>,
    ...
}

Updating objects

To update an object in PAIS, it is sufficient to send a POST request containing only the fields you want to modify. That is, you are not required to send the entire object content to the API.

For example, to change the name as well as description of the agent, you can use the following request:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>' \
    --data '{
        "name": "my new agent name",
        "description": "an updated description of my agent"
    }'

To reset an optional value to be unset, use the JSON null value. For example, to configure the agent to no longer make use of an index, use the following request:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>' \
    --data '{
        "index_id": null
    }'

Errors

The PAIS API returns errors using standard HTTP status codes (e.g., HTTP-404 Not Found or HTTP-409 Conflict) when an API request fails. To help you identify the cause or source of the problem, each error response contains a detailed error description in the response body:

{
    "detail": [
        {
            "error_code": "ROUTE_TO_MODEL_NOT_FOUND",
            "loc": [
                "body",
                "model"
            ],
            "value": "some/model-id"
        }
    ]
}

Every error is associated with a specific error_code that identifies why the request failed; loc and value specify which input in the request (if any) caused the problem.

Note that errors related to parsing or routing the request may not contain an error_code in the response if the API cannot interpret the request at all.
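
When scripting against the API, you can combine the HTTP status code with the structured error details. A minimal sketch, reusing the agents endpoint and the <agent-id> placeholder from the examples above (production code should also handle non-JSON error bodies):

import httpx

response = httpx.get(
    "https://pais.local/api/v1/compatibility/openai/v1/agents/<agent-id>",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",
    },
)
if response.status_code >= 400:
    # Each entry in "detail" may carry an error_code, loc, and value.
    for error in response.json().get("detail", []):
        print(error.get("error_code"), error.get("loc"), error.get("value"))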

Listing and pagination

The API contains endpoints for listing and optionally filtering objects. The format of each object is the same whether you are retrieving a single object or requesting a list of objects. When listing objects, the response uses the following format:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents?limit=100' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'
{
    "object": "list",
    "has_more": false,
    "data": [
        {
            "id": <agent-1-id>,
            "name": "my new agent name",
            ...
        },
        {
            "id": <agent-2-id>,
            "name": "my other agent",
            ...
        },
        ...
    ],
    "first_id": <agent-1-id>,
    "last_id": <agent-n-id>,
    ...
}

Object listings are always capped at a certain number of rows (configurable via the limit parameter), and it is your responsibility to issue requests until all objects have been retrieved. The API indicates whether more objects are available by setting the has_more field in the response.

The order in which objects are returned depends on the order parameter, which allows returning objects ordered by their creation timestamp using asc or desc (for ascending/descending created_at time, respectively).

Depending on the type of object, listings can be filtered by specifying optional filter criteria. Each endpoint documents which filters are supported.

If you need to know the total number of objects that match your filter (if any), provide the set_num_objects=true query argument:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents?set_num_objects=true' \
    --header 'Accept: application/json' \
    --header 'Authorization: Bearer <access-token>'
{
    "object": "list",
    "has_more": false,
    "num_objects": 2,
    "data": [
        {
            "id": <agent-1-id>,
            "name": "my new agent name",
            ...
        },
        {
            "id": <agent-2-id>,
            "name": "my other agent",
            ...
        },
        ...
    ],
    "first_id": <agent-1-id>,
    "last_id": <agent-2-id>
}

The response will contain a num_objects field to indicate the total number of available objects. The value is only set if explicitly requested, since calculating the number of total objects may incur a slowdown and is often not required by a requesting client.

For paginating across results, the API supports the limit and offset query arguments. While pagination via the offset parameter is convenient, be aware that you may skip objects if objects are deleted while you iterate.

To address this problem, the API offers an alternative form of pagination using the after and before parameters: each list response contains the id of the first and last object in the list (in first_id and last_id, respectively). These values can be used in subsequent calls to obtain the next page of objects via the before and/or after query arguments (depending on whether you are paginating in ascending or descending order).
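
A minimal pagination sketch in Python that retrieves all agents using the after parameter (assuming a token obtained as described in the Getting Started section):

import httpx

client = httpx.Client(
    base_url="https://pais.local/api/v1/compatibility/openai/v1",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",
    },
)

agents = []
after = None
while True:
    params = {"limit": 100}
    if after is not None:
        # Continue after the last object of the previous page.
        params["after"] = after
    page = client.get("/agents", params=params).json()
    agents.extend(page["data"])
    if not page["has_more"]:
        break
    after = page["last_id"]

print(f"retrieved {len(agents)} agents")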

Set up an agent: End to end

Let's implement a typical workflow you may want to follow to set up your PAIS instance.

In this example, you will

  • create a data source using Google Drive,
  • create a knowledge base and link the Google Drive data source to it,
  • create an index for the knowledge base and trigger an indexing to populate the index with data,
  • search for relevant documents in the knowledge base index,
  • create an agent and link it to the knowledge base index, and
  • interact with the agent to confirm it provides answers using context from the knowledge base.

Preparation

For this example, we assume you have already deployed

  • a PAIS instance using https://pais.local as URL,
  • an embedding model using routing name BAAI/bge-small-en-v1.5, and
  • an LLM/completion model using routing name meta-llama/Meta-Llama-3.1-8B-Instruct.

To make all commands as readable as possible, fetch an Access Token, as described in more detail above, and store it in an environment variable $token.

Models are available

To validate the models are declared as expected, list the available models:

curl 'https://pais.local/api/v1/compatibility/openai/v1/models' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "object": "list",
    "data": [
        {
            "id": "BAAI/bge-small-en-v1.5",
            "object": "model",
            "created": 1743425188,
            "owned_by": "",
            "model_type": "EMBEDDINGS",
            "model_engine": "INFINITY"
        },
        {
            "id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
            "object": "model",
            "created": 1743176301,
            "owned_by": "",
            "model_type": "COMPLETIONS",
            "model_engine": "VLLM"
        }
    ]
}

Embedding model is functional

To validate the embedding model is working, let's create embeddings for the text "hello world":

curl 'https://pais.local/api/v1/compatibility/openai/v1/embeddings' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "input": "hello world",
        "model": "BAAI/bge-small-en-v1.5"
    }'

{
    "object": "list",
    "data": [
        {
            "object": "embedding",
            "embedding": [
                0.01519611943513155,
                ...
                0.02606809325516224
            ],
            "index": 0
        }
    ],
    "model": "BAAI/bge-small-en-v1.5",
    "usage": {
        "prompt_tokens": 11,
        "total_tokens": 11
    }
}

Completion model is functional

To validate the LLM is working, let's create a chat-completion:

curl 'https://pais.local/api/v1/compatibility/openai/v1/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "stream": false,
        "max_tokens": 10,
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "messages": [
            {
                "role": "user",
                "content": "Are you there?"
            }
        ]
    }'

{
    "id": "chat-d5f3a14c26de4a8489247c20d18c7d5b",
    "object": "chat.completion",
    "created": 1743603564,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! Yes, I'm here. ...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 39,
        "completion_tokens": 10,
        "total_tokens": 49
    }
}

Workflow

With the environment fully configured, you can now create a RAG-enabled agent.

Configure a data source

First, let's ensure that the data source configuration you are about to use is valid and that the PAIS instance can reach the Google Drive APIs:

curl 'https://pais.local/api/v1/control/data-sources/test-connection' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data-raw '{
        "origin_url": "https://drive.google.com/drive/u/0/folders/...",
        "type": "GOOGLE_DRIVE",
        "credentials": "{\"type\": ...}"
    }'

{
    "status": "CONNECTIVITY_RESULT_SUCCESS",
    "detail": null
}

The connectivity check is successful, so you can use these Google Drive settings for creating your data source:

curl 'https://pais.local/api/v1/control/data-sources' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data-raw '{
        "origin_url": "https://drive.google.com/drive/u/0/folders/...",
        "name": "test-data-source",
        "description": "Example data source using data from Google Drive",
        "type": "GOOGLE_DRIVE",
        "credentials": "{\"type\": \"service_account\", ...}"
    }'

{
    "id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
    "origin_url": "https://drive.google.com/drive/u/0/folders/...",
    "name": "test-data-source",
    "description": "Example data source using data from Google Drive",
    "type": "GOOGLE_DRIVE",
    "object": "data_source",
    "created_at": 1743603960,
    "last_updated_at": 1743603960,
    "options": null
}

The data source has been created and you can use ID cd918a65-c465-4d28-973b-9f8cc6ec62b9 for referring to it moving forward.

Optionally, you can validate that the created data source is configured correctly and that the PAIS instance can reach the Google APIs:

curl 'https://pais.local/api/v1/control/data-sources/test-connection/cd918a65-c465-4d28-973b-9f8cc6ec62b9' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "status": "CONNECTIVITY_RESULT_SUCCESS",
    "detail": null
}

Like before, you can see that the data source is configured successfully.

Configure a knowledge base

First, create the knowledge base and configure it to use data sources:

curl 'https://pais.local/api/v1/control/knowledge-bases' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "index_refresh_policy": {
            "policy_type": "MANUAL"
        },
        "data_origin_type": "DATA_SOURCES",
        "name": "test-knowledge-base",
        "description": "Example knowledge base using data from Google Drive"
    }'

{
    "id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "name": "test-knowledge-base",
    "description": "Example knowledge base using data from Google Drive",
    "index_refresh_policy": {
        "policy_type": "MANUAL"
    },
    "data_origin_type": "DATA_SOURCES",
    "object": "knowledge_base",
    "created_at": 1743605493,
    "last_updated_at": 1743605493,
    "next_index_refresh_at": null
}

The knowledge base has been created and you can use ID d3a2cc76-0979-41db-b97a-ab133fd463a9 for referring to it moving forward.

Next, you link the data source to the knowledge base:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/data-sources' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "data_source_id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9"
    }'

{
    "id": "cb9667e8-1fb9-4706-898b-881d51f46594",
    "state": "NOT_INDEXED",
    "data_source": {
        "id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
        "origin_url": "https://drive.google.com/drive/u/0/folders/...",
        "name": "test-data-source",
        "description": "Example data source using data from Google Drive",
        "type": "GOOGLE_DRIVE",
        "object": "data_source",
        "created_at": 1743603960,
        "last_updated_at": 1743603960,
        "options": null
    }
}

As you can see, the data source is now linked to the knowledge base, but its state is NOT_INDEXED, as you have not indexed any files yet.

Build the index

First, create an index for your knowledge base using the embedding model BAAI/bge-small-en-v1.5 and a simple sentence splitter. Configure the index to produce text chunks of 100 tokens with no overlap:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "name": "test-index",
        "description": "Example index using data from Google Drive",
        "embeddings_model_endpoint": "BAAI/bge-small-en-v1.5",
        "text_splitting": "SENTENCE",
        "chunk_size": 100,
        "chunk_overlap": 0
    }'

{
    "id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "name": "test-index",
    "description": "Example index using data from Google Drive",
    "embeddings_model_endpoint": "BAAI/bge-small-en-v1.5",
    "text_splitting": "SENTENCE",
    "chunk_size": 100,
    "chunk_overlap": 0,
    "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "object": "index",
    "created_at": 1743606389,
    "last_indexed_at": null,
    "last_indexed_by_id": null,
    "num_documents": 0,
    "status": "AVAILABLE",
    "status_errors": null
}

The index has been created and you can use ID 32e18f23-c4fe-4e13-bcea-ec1f61808871 for referring to it moving forward. The status shows AVAILABLE, indicating that there is no error and it is ready to be used.

Since you configured your knowledge base with a manual index refresh policy, indexings are not triggered automatically, and querying for the active indexing returns an HTTP-404:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/active-indexing' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "detail": [
        {
            "error_code": "NO_ACTIVE_INDEXING",
            "loc": null,
            "msg": null,
            "type": null,
            "value": null
        }
    ]
}

You can now explicitly trigger an indexing to populate the index with data:

curl --request POST 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/indexings' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "c967401f-02f9-487b-a3ad-46833a387d49",
    "state": "PENDING",
    "created_at": 1743606878,
    "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "completed_at": null
}

Initially, the indexing will be in the PENDING state but will quickly start running, and eventually you can see that the indexing completes with state DONE:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/active-indexing' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "c967401f-02f9-487b-a3ad-46833a387d49",
    "state": "DONE",
    "created_at": 1743606878,
    "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "completed_at": 1743606885
}

Note that you could alternatively have configured an index refresh policy that automatically and periodically triggers indexings (e.g., every hour or every day).
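
If you are scripting the workflow, you can poll the active-indexing endpoint until the indexing reaches the DONE state. A minimal sketch, reusing the IDs from this example (the 5-second interval is arbitrary; production code should also handle failed indexings and the HTTP-404 returned when no indexing is active):

import time

import httpx

kb_id = "d3a2cc76-0979-41db-b97a-ab133fd463a9"
index_id = "32e18f23-c4fe-4e13-bcea-ec1f61808871"

client = httpx.Client(
    base_url="https://pais.local/api/v1/control",
    headers={
        "Accept": "application/json",
        "Authorization": "Bearer <access-token>",
    },
)

# Poll until the triggered indexing reports state DONE.
while True:
    indexing = client.get(
        f"/knowledge-bases/{kb_id}/indexes/{index_id}/active-indexing"
    ).json()
    if indexing.get("state") == "DONE":
        break
    time.sleep(5)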

You can now list the documents that have been added to the index:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/documents' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "object": "list",
    "has_more": false,
    "num_objects": null,
    "data": [
        {
            "origin_name": "test-file.txt",
            "origin_ref": "https://drive.google.com/file/d/...",
            "id": "ed6010b9-d51a-48c8-8a24-c9892bcac909",
            "object": "document",
            "state": "INDEXED",
            "media_type": "TXT",
            "knowledge_base_id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
            "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
            "data_source_id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
            "created_at": 1743606880,
            "last_embedded_at": 1743606882,
            "last_indexed_by_id": "c967401f-02f9-487b-a3ad-46833a387d49",
            "size_bytes": 963,
            "hash_sha256": "a17a5236fee2df65e8ecf0555a530a65d28af1b23d1439a919e1f17894e4d1c2",
            "reindexing_needed": false
        },
        ...
    ],
    "first_id": "ed6010b9-d51a-48c8-8a24-c9892bcac909",
    "last_id": "c2478660-9e5b-4930-8f0a-2af839ddb0fb"
}

You can also run queries against these documents:

curl 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9/indexes/32e18f23-c4fe-4e13-bcea-ec1f61808871/search' \
    --header 'Accept: application/json' \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "text": "what is private AI?",
        "top_k": 2,
        "similarity_cutoff": 0
    }'

{
    "chunks": [
        {
            "origin_name": "private-ai.txt",
            "origin_ref": "https://drive.google.com/file/d/...",
            "document_id": "3343786c-a232-4986-9f89-cff373766de1",
            "score": 0.4686617851257324,
            "media_type": "TXT",
            "text": "..."
        },
        ...
    ]
}

Configure the agent

Finally, create your agent using the LLM meta-llama/Meta-Llama-3.1-8B-Instruct and link it to the index created above. Configure it to use a single document chunk (the most relevant one) from the index as context for answering questions, and provide instructions to use when interacting with the LLM:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "name": "test-agent",
        "description": "Example agent using data from Google Drive",
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
        "instructions": "You are a polite and helpful assistant.",
        "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
        "index_top_n": 1,
        "index_similarity_cutoff": 0,
        "completion_role": "assistant",
        "session_max_length": "10000",
        "session_summarization_strategy": "delete_oldest",
        "index_reference_format": "structured",
        "chat_system_instruction_mode": "system-message"
    }'

{
    "name": "test-agent",
    "description": "Example agent using data from Google Drive",
    "instructions": "You are a polite and helpful assistant.",
    "session_max_ttl": null,
    "completion_role": "assistant",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "index_top_n": 1,
    "index_similarity_cutoff": 0.0,
    "index_reference_format": "structured",
    "index_reference_delimiter": null,
    "session_max_length": 10000,
    "session_summarization_strategy": "delete_oldest",
    "metadata": {},
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent",
    "created_at": 1743607461,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "status": "AVAILABLE",
    "status_errors": null,
    "chat_system_instruction_mode": "system-message"
}

The agent has been created and you can use ID e52e4764-bdc4-4b16-a906-3c977d746d74 for referring to it moving forward. The status shows AVAILABLE, indicating that there is no error and the agent is ready to be used. You can now send it a chat-completion request:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'

{
    "id": "chat-882f92ca78b04ea7ae239aa6ea8751e2",
    "object": "chat.completion",
    "created": 1743607788,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "It appears to be a collection of keywords...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 159,
        "completion_tokens": 100,
        "total_tokens": 259
    },
    "session_id": null,
    "index_context_info": [
        {
            "text": "...",
            "metadata": {
                "document_id": "3343786c-a232-4986-9f89-cff373766de1",
                "score": 0.4686617851257324,
                "origin_name": "private-ai.txt",
                "media_type": "TXT",
                "origin_ref": "https://drive.google.com/file/d/..."
            }
        }
    ]
}

The response from the agent contains the actual LLM response and includes metadata about the document chunks that were retrieved from the index as context for the LLM.

How much context is retrieved (if any) and how it is returned to you are configurable as part of the agent. For example, you can update the agent to no longer return the context:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "index_reference_format": null
    }'

{
    "name": "test-agent",
    "description": "Example agent using data from Google Drive",
    "instructions": "You are a polite and helpful assistant.",
    "session_max_ttl": null,
    "completion_role": "assistant",
    "index_id": "32e18f23-c4fe-4e13-bcea-ec1f61808871",
    "index_top_n": 1,
    "index_similarity_cutoff": 0.0,
    "index_reference_format": null,
    "index_reference_delimiter": null,
    "session_max_length": 10000,
    "session_summarization_strategy": "delete_oldest",
    "metadata": {},
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent",
    "created_at": 1743607461,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "status": "AVAILABLE",
    "status_errors": null,
    "chat_system_instruction_mode": "system-message"
}

Moving forward, responses from the agent will no longer contain the context metadata:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'

{
    "session_id": null,
    "index_context_info": null,
    "id": "chat-23bc6fdd15724eeabe438de8f3b38c63",
    "object": "chat.completion",
    "created": 1743609282,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "It appears to be a collection of keywords...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 159,
        "completion_tokens": 100,
        "total_tokens": 259
    }
}

Lastly, you can update the agent once more so that it no longer uses any information from the knowledge base. For the sake of the demo, you can also update the agent instructions to tell it to answer in a different language:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "index_id": null,
        "instructions": "You are a polite and helpful assistant who always answers in German."
    }'

{
    "name": "test-agent",
    "description": "Example agent using data from Google Drive",
    "instructions": "You are a polite and helpful assistant who always answers in German.",
    "session_max_ttl": null,
    "completion_role": "assistant",
    "index_id": null,
    "index_top_n": null,
    "index_similarity_cutoff": null,
    "index_reference_format": null,
    "index_reference_delimiter": null,
    "session_max_length": 10000,
    "session_summarization_strategy": "delete_oldest",
    "metadata": {},
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent",
    "created_at": 1743607461,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "status": "AVAILABLE",
    "status_errors": null,
    "chat_system_instruction_mode": "system-message"
}

When you interact with the agent moving forward, it will no longer have any context from the data indexed from Google Drive, and it will answer in German:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token" \
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'
{
    "session_id": null,
    "index_context_info": null,
    "id": "chat-66fce20014e84b638341e95f40a48411",
    "object": "chat.completion",
    "created": 1743609492,
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Die private AI, auch bekannt als künstliche Intelligenz ...",
                "tool_calls": null
            },
            "finish_reason": "length"
        }
    ],
    "usage": {
        "prompt_tokens": 53,
        "completion_tokens": 100,
        "total_tokens": 153
    }
}

Clean-up

Finally, you can clean up the test objects you created in this workflow.

Delete the agent:

curl --request DELETE 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "e52e4764-bdc4-4b16-a906-3c977d746d74",
    "object": "agent.deleted",
    "deleted": true
}

Delete the knowledge base and index:

curl --request DELETE 'https://pais.local/api/v1/control/knowledge-bases/d3a2cc76-0979-41db-b97a-ab133fd463a9' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "d3a2cc76-0979-41db-b97a-ab133fd463a9",
    "object": "knowledge_base.deleted",
    "deleted": true
}

Delete the data source:

curl --request DELETE 'https://pais.local/api/v1/control/data-sources/cd918a65-c465-4d28-973b-9f8cc6ec62b9' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "id": "cd918a65-c465-4d28-973b-9f8cc6ec62b9",
    "object": "data_source.deleted",
    "deleted": true
}

You can validate that everything has been deleted, for example by listing the available knowledge bases:

curl 'https://pais.local/api/v1/control/knowledge-bases?set_num_objects=true' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"

{
    "object": "list",
    "has_more": false,
    "num_objects": 0,
    "data": [],
    "first_id": null,
    "last_id": null
}

And any interactions with the deleted agent will result in an error:

curl 'https://pais.local/api/v1/compatibility/openai/v1/agents/e52e4764-bdc4-4b16-a906-3c977d746d74/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Accept: application/json' \
    --header "Authorization: Bearer $token"
    --data '{
        "temperature": 0.01,
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "What is private AI?"
            }
        ],
        "stream": false
    }'