Create Assistant Chat Completion

Create a chat completion using an agent.

This method creates a model response for the given chat conversation. The conversation is updated to reflect the agent settings and to use any inputs from enabled integrations, before it is forwarded to the LLM.

This method is compatible with the OpenAI endpoint for creating a chat completion. However, it infers additional inputs to the conversation as described above.

Request

URI

POST

https://{api_host}/api/v1/compatibility/openai/v1/assistants/{agent_id}/chat/completions


Path Parameters

string

agent_id

Required

ID of the agent to use for the chat completion.
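As an illustration, the URI above can be assembled from its two placeholders. This is a minimal Python sketch: the helper name `chat_completions_url` and the host `api.example.com` are hypothetical and not part of the API.

```python
from urllib.parse import quote

# Path template taken from the URI shown above.
PATH_TEMPLATE = "/api/v1/compatibility/openai/v1/assistants/{agent_id}/chat/completions"


def chat_completions_url(api_host: str, agent_id: str) -> str:
    """Return the endpoint URL for one agent (hypothetical helper)."""
    # Percent-encode the ID so unexpected characters cannot alter the path.
    encoded_id = quote(agent_id, safe="")
    return f"https://{api_host}" + PATH_TEMPLATE.format(agent_id=encoded_id)


# "api.example.com" and "my-agent" are placeholder values.
url = chat_completions_url("api.example.com", "my-agent")
print(url)
```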

Request Body

AgentChatCompletionsRequestPayload of type(s) application/json

Required


{
    "messages": [
        {
            "content": "You are a helpful assistant.",
            "role": "system"
        },
        {
            "content": "Hello!",
            "role": "user"
        }
    ]
}
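The minimal body above could be sent from Python roughly as follows. This sketch only builds the request object and does not send it. The helper name, the host, and the agent ID are hypothetical, and the `Authorization` value is left as a placeholder because this page does not specify its format.

```python
import json
import urllib.request


def build_request(url: str, authorization: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying only the required field."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
    )


request = build_request(
    # Placeholder host and agent ID.
    "https://api.example.com/api/v1/compatibility/openai/v1/assistants/my-agent/chat/completions",
    "<value>",
    [{"role": "user", "content": "Hello!"}],
)
# Sending is left to the caller, for example: urllib.request.urlopen(request)
```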

{
    "create_session": false,
    "temperature": "number",
    "n": 0,
    "stop": [
        "string"
    ],
    "max_tokens": 0,
    "stream": false,
    "stream_options": {
        "include_usage": false
    },
    "model": "string",
    "messages": [
        {
            "role": "string",
            "content": "string",
            "tool_calls": [
                {
                    "id": "string",
                    "type": "string",
                    "function": {
                        "name": "string",
                        "arguments": "string"
                    }
                }
            ],
            "refusal": "string",
            "tool_call_id": "string"
        }
    ],
    "tools": [
        {
            "type": "string",
            "function": {
                "name": "string",
                "description": "string",
                "parameters": {
                    "properties": {
                        "properties": {
                            "type": "string",
                            "description": "string",
                            "enum": [
                                {}
                            ],
                            "items": {
                                "type": "string"
                            }
                        }
                    },
                    "type": "string",
                    "required": [
                        "string"
                    ],
                    "additionalProperties": false
                },
                "strict": false
            },
            "uc_function": {
                "name": "string"
            }
        }
    ],
    "seed": 0,
    "tool_choice": "string",
    "store_in_session": "string"
}

boolean

create_session

Optional

If true, the request creates a new agent session and the LLM interaction is stored as context for subsequent agent interactions when using the generated session.

number

temperature

Optional

Constraints: minimum: 0, maximum: 2, default: 0

temperature

integer

n

Optional

Constraints: minimum: 1, default: 1

array of string

stop

Optional

Constraints: minItems: 1

stop

integer

max_tokens

Optional

Constraints: minimum: 1

max_tokens
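The constraints listed for `temperature`, `n`, `stop`, and `max_tokens` can be checked on the client before sending. The following is a minimal sketch under the constraints stated on this page; the function name is hypothetical and the server remains the authority on validation.

```python
def check_sampling_options(payload: dict) -> list:
    """Return a list of constraint violations (empty when the payload is fine)."""
    problems = []

    temperature = payload.get("temperature", 0)  # documented default: 0
    if not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")

    n = payload.get("n", 1)  # documented default: 1
    if n < 1:
        problems.append("n must be at least 1")

    stop = payload.get("stop")
    if stop is not None and len(stop) < 1:
        problems.append("stop must contain at least one string")

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:
        problems.append("max_tokens must be at least 1")

    return problems


valid = check_sampling_options({"temperature": 0.7, "n": 1, "stop": ["\n"], "max_tokens": 64})
invalid = check_sampling_options({"temperature": 3, "n": 0, "stop": [], "max_tokens": 0})
```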

boolean

stream

Optional

stream

stream_options

Optional

Options for streaming response. Only set this when you set stream: true.
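One way to respect the rule that `stream_options` is only set together with `stream: true` is to generate both fields in one place. This is a sketch; the helper name is hypothetical.

```python
def streaming_fields(stream: bool, include_usage: bool = False) -> dict:
    """Return the stream-related request fields (hypothetical helper)."""
    if not stream:
        # stream_options should only accompany stream: true, so it is omitted here.
        return {"stream": False}
    return {"stream": True, "stream_options": {"include_usage": include_usage}}


payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    **streaming_fields(True, include_usage=True),
}
```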

string

model

Optional

Optional ID of the model to use. If provided, it must match the model specified in the agent configuration. Unless the client needs to validate that the specified model is in use by the agent, do not specify this value and the API will choose the correct model. For compatibility with the OpenAI client SDK, this parameter may either be unset or an empty string may be used to indicate the use of the agent default configuration.
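The three cases described above (omit `model`, send an empty string for SDK compatibility, or name the model explicitly) can be sketched as a small helper. The helper and its `sdk_requires_model` flag are hypothetical; the model name is the one shown in the sample response on this page.

```python
from typing import Optional


def model_fields(model: Optional[str] = None, sdk_requires_model: bool = False) -> dict:
    """Return the `model` field to include in the request, if any."""
    if model:
        # Explicit model: the API rejects it unless it matches the agent configuration.
        return {"model": model}
    if sdk_requires_model:
        # An empty string selects the agent's default configuration.
        return {"model": ""}
    # Leaving the field out lets the API choose the agent's model.
    return {}


default_fields = model_fields()
sdk_fields = model_fields(sdk_requires_model=True)
pinned_fields = model_fields("llama-2-70b-chat-hf")
```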

array of object

messages

Required

Constraints: minItems: 1

A chat request. content can be a string, or an array of content parts.

A content part is one of the following:

TextContentPart (mlflow.types.chat.TextContentPart)
ImageContentPart (mlflow.types.chat.ImageContentPart)
AudioContentPart (mlflow.types.chat.AudioContentPart)
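The two forms of `content` can be illustrated as follows. This page does not list the fields of each content part, so the field names below are an assumption: they follow the OpenAI chat format that this endpoint is stated to be compatible with. The helper names and the image URL are hypothetical.

```python
def text_part(text: str) -> dict:
    """Text content part (field names assumed from the OpenAI chat format)."""
    return {"type": "text", "text": text}


def image_part(url: str) -> dict:
    """Image content part (field names assumed from the OpenAI chat format)."""
    return {"type": "image_url", "image_url": {"url": url}}


# content as a plain string
plain_message = {"role": "user", "content": "Hello!"}

# content as an array of content parts
multipart_message = {
    "role": "user",
    "content": [
        text_part("What is in this image?"),
        image_part("https://example.com/cat.png"),  # placeholder URL
    ],
}
```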

array of object

tools

Optional

A tool definition for the chat endpoint. In addition to function tools, the Gateway request accepts a special tool type 'uc_function' for Unity Catalog integration. See https://mlflow.org/docs/latest/llms/deployments/uc_integration.html
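Based on the request schema shown earlier, a `tools` array might be assembled like this. It is a sketch: the helper names, the weather function, and the Unity Catalog function name are hypothetical, and the `"function"` type value is assumed from the OpenAI chat format.

```python
def function_tool(name: str, description: str, parameters: dict) -> dict:
    """Function tool (type value assumed from the OpenAI chat format)."""
    return {
        "type": "function",
        "function": {"name": name, "description": description, "parameters": parameters},
    }


def uc_function_tool(name: str) -> dict:
    """Unity Catalog tool using the special 'uc_function' type."""
    return {"type": "uc_function", "uc_function": {"name": name}}


tools = [
    function_tool(
        "get_weather",  # hypothetical function
        "Look up the current weather for a city.",
        {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
            "additionalProperties": False,
        },
    ),
    uc_function_tool("my_catalog.my_schema.my_function"),  # hypothetical UC function
]
```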

integer

seed

Optional

Seed to propagate to the LLM for making repeated requests with the same seed as deterministic as possible. Note that this feature is in beta for most inference servers.

tool_choice

Optional

tool_choice

string (uuid)

store_in_session

Optional

If set, use and extend the context stored in the given session for all LLM interactions.
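The two session-related fields, `create_session` and `store_in_session`, can be sketched as alternatives: start a new session, or continue an existing one. Treating them as mutually exclusive is an assumption of this sketch, and this page does not describe how the generated session ID is returned, so the ID is taken as an input here. The helper name is hypothetical.

```python
import uuid
from typing import Optional


def session_fields(session_id: Optional[str] = None) -> dict:
    """Return the session-related request fields (hypothetical helper)."""
    if session_id is None:
        # No existing session: ask the API to create one.
        return {"create_session": True}
    # store_in_session is a UUID string; uuid.UUID raises ValueError if it is malformed.
    return {"store_in_session": str(uuid.UUID(session_id))}


new_session = session_fields()
existing_session = session_fields("3CDB958C-E4CC-4834-B52B-1D1A7F324714")  # placeholder ID
```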

Authentication

This operation uses the following authentication methods.

openId

Responses

200

Successful Response

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "Hello! I am an AI assistant",
                "role": "assistant"
            }
        }
    ],
    "created": 1700173217,
    "id": "3cdb958c-e4cc-4834-b52b-1d1a7f324714",
    "model": "llama-2-70b-chat-hf",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 8,
        "prompt_tokens": 10,
        "total_tokens": 18
    }
}
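A client typically needs the assistant text and the token usage from this response. The sketch below reads both from a dictionary shaped like the sample above; the function name is hypothetical.

```python
def first_reply(response: dict) -> tuple:
    """Return (assistant text, total tokens) from a chat completion response."""
    choice = response["choices"][0]
    content = choice["message"]["content"]
    total_tokens = response["usage"]["total_tokens"]
    return content, total_tokens


# Trimmed copy of the sample response shown above.
sample = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "Hello! I am an AI assistant", "role": "assistant"},
        }
    ],
    "object": "chat.completion",
    "usage": {"completion_tokens": 8, "prompt_tokens": 10, "total_tokens": 18},
}

text, tokens = first_reply(sample)
```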

400

Unsupported role (e.g., "system") or unexpected model in chat completions message.

Operation doesn't return any data structure

404

No agent found with the specified ID.

Operation doesn't return any data structure

422

Validation Error

Returns HTTPValidationError of type(s) application/json

{
    "detail": [
        {
            "loc": [
                {}
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}

array of object

detail

Optional

detail
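The `HTTPValidationError` body can be flattened into readable messages. This is a sketch based on the schema above; the function name is hypothetical and the example error values are invented for illustration.

```python
def format_validation_errors(error_body: dict) -> list:
    """Turn a 422 response body into one line per validation problem."""
    lines = []
    # detail is optional, so default to an empty list.
    for item in error_body.get("detail", []):
        location = ".".join(str(part) for part in item.get("loc", []))
        lines.append(f"{location}: {item.get('msg', '')} ({item.get('type', '')})")
    return lines


# Invented example values; real messages come from the server.
example = {
    "detail": [
        {"loc": ["body", "messages"], "msg": "field required", "type": "value_error.missing"}
    ]
}
messages = format_validation_errors(example)
```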

Code Samples


curl -X POST \
  -H 'Authorization: <value>' \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"Hello!"}]}' \
  'https://{api_host}/api/v1/compatibility/openai/v1/assistants/{agent_id}/chat/completions'
