Create Assistant Chat Completion

Create a chat completion using an agent.

This method creates a model response for the given chat conversation. Before the conversation is forwarded to the LLM, it is updated to reflect the agent settings and to include any inputs from enabled integrations.

This method is compatible with the OpenAI endpoint for creating a chat completion. However, it infers additional inputs to the conversation as described above.

Request
URI
POST
https://{api_host}/api/v1/compatibility/openai/v1/assistants/{agent_id}/chat/completions
Path Parameters
string
agent_id
Required

The ID of the agent used to create the chat completion.


Request Body
AgentChatCompletionsRequestPayload of type(s) application/json
Required

{
    "messages": [
        {
            "content": "You are a helpful assistant.",
            "role": "system"
        },
        {
            "content": "Hello!",
            "role": "user"
        }
    ]
}
{
    "create_session": false,
    "temperature": 0,
    "n": 0,
    "stop": [
        "string"
    ],
    "max_tokens": 0,
    "stream": false,
    "stream_options": {
        "include_usage": false
    },
    "model": "string",
    "messages": [
        {
            "role": "string",
            "content": "string",
            "tool_calls": [
                {
                    "id": "string",
                    "type": "string",
                    "function": {
                        "name": "string",
                        "arguments": "string"
                    }
                }
            ],
            "refusal": "string",
            "tool_call_id": "string"
        }
    ],
    "tools": [
        {
            "type": "string",
            "function": {
                "name": "string",
                "description": "string",
                "parameters": {
                    "properties": {
                        "properties": {
                            "type": "string",
                            "description": "string",
                            "enum": [
                                {}
                            ],
                            "items": {
                                "type": "string"
                            }
                        }
                    },
                    "type": "string",
                    "required": [
                        "string"
                    ],
                    "additionalProperties": false
                },
                "strict": false
            },
            "uc_function": {
                "name": "string"
            }
        }
    ],
    "seed": 0,
    "tool_choice": "string",
    "store_in_session": "string"
}
boolean
create_session
Optional

If true, the request creates a new agent session and the LLM interaction is stored as context for subsequent agent interactions when using the generated session.

number
temperature
Optional
Constraints: minimum: 0 maximum: 2 default: 0

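Sampling temperature. As in the OpenAI chat completions API, higher values make the output more random and lower values make it more deterministic.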

integer
n
Optional
Constraints: minimum: 1 default: 1

Number of chat completion choices to generate, as in the OpenAI chat completions API.

array of string
stop
Optional
Constraints: minItems: 1

Sequences at which the LLM stops generating further tokens, as in the OpenAI chat completions API.

integer
max_tokens
Optional
Constraints: minimum: 1

Maximum number of tokens to generate in the completion, as in the OpenAI chat completions API.
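The constraints listed for temperature, n, stop, and max_tokens can be checked on the client before a request is sent. The following is a minimal sketch of such a check; the function name `validate_sampling_options` is hypothetical and is not part of this API, and the server remains the authority on what it accepts.

```python
def validate_sampling_options(payload: dict) -> list:
    """Return a list of constraint violations; an empty list means no violation was found.

    Hypothetical client-side helper that mirrors the documented constraints:
    temperature in [0, 2], n >= 1, stop with at least one item, max_tokens >= 1.
    """
    problems = []

    temperature = payload.get("temperature")
    if temperature is not None and not (0 <= temperature <= 2):
        problems.append("temperature must be between 0 and 2")

    n = payload.get("n")
    if n is not None and n < 1:
        problems.append("n must be at least 1")

    stop = payload.get("stop")
    if stop is not None and len(stop) < 1:
        problems.append("stop must contain at least one string")

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:
        problems.append("max_tokens must be at least 1")

    return problems
```

Omitted properties are skipped because all four are optional.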

boolean
stream
Optional

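If true, the response is streamed back incrementally instead of being returned as a single object, as in the OpenAI chat completions API.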

stream_options
Optional

Options for streaming response. Only set this when you set stream: true.
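As an illustration of how stream and stream_options fit together, the sketch below builds a streaming payload and parses streamed lines. It assumes the streamed response follows the OpenAI server-sent events convention (`data: <json>` lines ending with `data: [DONE]`), which this page does not confirm. The helper names `streaming_payload` and `iter_stream_chunks` are hypothetical.

```python
import json


def streaming_payload(messages: list, include_usage: bool = True) -> dict:
    """Build a request body that enables streaming; stream_options is only set together with stream."""
    return {
        "messages": messages,
        "stream": True,
        "stream_options": {"include_usage": include_usage},
    }


def iter_stream_chunks(lines):
    """Yield decoded JSON chunks from an iterable of text lines.

    Assumption: the server emits OpenAI-style server-sent events,
    one 'data: {...}' line per chunk, terminated by 'data: [DONE]'.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield json.loads(data)
```

The parser takes any iterable of text lines, so it can be exercised with a plain list before being wired to a real HTTP response.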

string
model
Optional

Optional ID of the model to use. If provided, it must match the model specified in the agent configuration. Unless the client needs to validate that the specified model is in use by the agent, do not specify this value and the API will choose the correct model. For compatibility with the OpenAI client SDK, this parameter may either be unset or an empty string may be used to indicate the use of the agent default configuration.

array of object
messages
Required
Constraints: minItems: 1

A chat request. content can be a string, or an array of content parts.

A content part is one of the following:

  • TextContentPart (mlflow.types.chat.TextContentPart)
  • ImageContentPart (mlflow.types.chat.ImageContentPart)
  • AudioContentPart (mlflow.types.chat.AudioContentPart)
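The two accepted shapes of content can be sketched as follows. The field names of the content parts (`type`, `text`, `image_url`, `url`) follow the OpenAI content-part convention and are an assumption here, since this page does not list them. The helper names and the example URL are hypothetical.

```python
def text_part(text: str) -> dict:
    """A text content part (assumed OpenAI-style shape)."""
    return {"type": "text", "text": text}


def image_part(url: str) -> dict:
    """An image content part referencing an image by URL (assumed OpenAI-style shape)."""
    return {"type": "image_url", "image_url": {"url": url}}


# content given as a plain string
string_message = {"role": "user", "content": "Hello!"}

# content given as an array of content parts
parts_message = {
    "role": "user",
    "content": [
        text_part("What is in this image?"),
        image_part("https://example.com/cat.png"),  # placeholder URL
    ],
}

# messages must contain at least one entry (minItems: 1)
payload = {"messages": [parts_message]}
```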
array of object
tools
Optional

A tool definition for the chat endpoint with Unity Catalog integration. The Gateway request accepts a special tool type 'uc_function' for Unity Catalog integration. https://mlflow.org/docs/latest/llms/deployments/uc_integration.html
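A minimal sketch of the two kinds of tool entry implied by the request schema: a regular function tool and a Unity Catalog tool of type 'uc_function'. The type value "function" follows the OpenAI convention and is an assumption, as are the helper names and the example function names (`get_weather`, `main.default.lookup_order`).

```python
def function_tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Build a function tool entry with a JSON-schema style parameters object."""
    return {
        "type": "function",  # assumed, following the OpenAI convention
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
                "additionalProperties": False,
            },
            "strict": False,
        },
    }


def uc_function_tool(name: str) -> dict:
    """Build a Unity Catalog tool entry using the special 'uc_function' type."""
    return {"type": "uc_function", "uc_function": {"name": name}}


tools = [
    function_tool(
        "get_weather",  # hypothetical function name
        "Look up the current weather for a city.",
        {"city": {"type": "string", "description": "City name"}},
        ["city"],
    ),
    uc_function_tool("main.default.lookup_order"),  # hypothetical catalog function name
]
```

The resulting list would be sent as the tools property of the request body.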

integer
seed
Optional

Seed to propagate to the LLM for making repeated requests with the same seed as deterministic as possible. Note that this feature is in beta for most inference servers.

tool_choice
Optional

Controls which tool, if any, the LLM calls, as in the OpenAI chat completions API.

string (uuid)
store_in_session
Optional

If set, use and extend the context stored in the given session for all LLM interactions.
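The interplay of create_session and store_in_session can be sketched as two request bodies: one that asks for a new session and one that continues an existing session. How the generated session ID is returned to the client is not described on this page, so the `session_id` argument below is a placeholder that the caller must obtain by other means. The helper names are hypothetical.

```python
import uuid


def first_turn(messages: list) -> dict:
    """Request body that creates a new agent session and stores this interaction in it."""
    return {"messages": messages, "create_session": True}


def follow_up(messages: list, session_id: str) -> dict:
    """Request body that uses and extends the context of an existing session.

    Raises ValueError if session_id is not a valid UUID string,
    since store_in_session is documented as a uuid.
    """
    normalized = str(uuid.UUID(session_id))
    return {"messages": messages, "store_in_session": normalized}
```

Validating the UUID on the client surfaces a malformed ID before the request is sent.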

Authentication
This operation uses the following authentication methods.
Responses
200

Successful Response

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "Hello! I am an AI assistant",
                "role": "assistant"
            }
        }
    ],
    "created": 1700173217,
    "id": "3cdb958c-e4cc-4834-b52b-1d1a7f324714",
    "model": "llama-2-70b-chat-hf",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 8,
        "prompt_tokens": 10,
        "total_tokens": 18
    }
}
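Reading the successful response can be sketched as below, using the sample body from this page. The helper names `first_reply` and `token_usage` are hypothetical, and the code assumes the response contains at least one choice.

```python
import json

# Sample 200 response body, copied from the example above.
SAMPLE_RESPONSE = """
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {"content": "Hello! I am an AI assistant", "role": "assistant"}
        }
    ],
    "created": 1700173217,
    "id": "3cdb958c-e4cc-4834-b52b-1d1a7f324714",
    "model": "llama-2-70b-chat-hf",
    "object": "chat.completion",
    "usage": {"completion_tokens": 8, "prompt_tokens": 10, "total_tokens": 18}
}
"""


def first_reply(response: dict) -> str:
    """Return the assistant text of the first choice."""
    return response["choices"][0]["message"]["content"]


def token_usage(response: dict) -> tuple:
    """Return (prompt_tokens, completion_tokens, total_tokens), defaulting to 0 when absent."""
    usage = response.get("usage", {})
    return (
        usage.get("prompt_tokens", 0),
        usage.get("completion_tokens", 0),
        usage.get("total_tokens", 0),
    )


response = json.loads(SAMPLE_RESPONSE)
print(first_reply(response))
print(token_usage(response))
```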

400

Unsupported role (e.g., "system") or unexpected model in chat completions message.

Operation doesn't return any data structure

404

No agent found with the specified ID.

Operation doesn't return any data structure

422

Validation Error

Returns HTTPValidationError of type(s) application/json
{
    "detail": [
        {
            "loc": [
                {}
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}
array of object
detail
Optional

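List of validation errors. Each entry contains the location of the error (loc), a message (msg), and an error type (type).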
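A client might turn the documented error responses into readable messages along these lines. This is a sketch only: the function name `summarize_error` and the message wording are hypothetical, and only the 422 response is documented as carrying a body.

```python
from typing import Optional


def summarize_error(status: int, body: Optional[dict] = None) -> str:
    """Map a documented error status (and the 422 body, if any) to a short message."""
    if status == 400:
        return "bad request: unsupported role or unexpected model"
    if status == 404:
        return "no agent found with the specified ID"
    if status == 422:
        details = (body or {}).get("detail", [])
        parts = []
        for item in details:
            # loc is an array; join its elements into a dotted path
            location = ".".join(str(p) for p in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', '')}")
        return "validation error: " + "; ".join(parts)
    return f"unexpected status {status}"
```

For example, a 422 body whose detail entry has loc `["body", "messages"]` and msg `"field required"` is summarized as `validation error: body.messages: field required`.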


Code Samples
curl -X POST -H 'Authorization: <value>' -H 'Content-Type: application/json' -d '{"messages":[{"role":"user","content":"Hello!"}]}' 'https://{api_host}/api/v1/compatibility/openai/v1/assistants/{agent_id}/chat/completions'
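The same request can be prepared in Python with the standard library. This sketch only builds the request object and does not send it. The function name `build_request` and the example host, agent ID, and Authorization value are placeholders, and the expected format of the Authorization header is not specified on this page.

```python
import json
import urllib.request
from urllib.parse import quote


def build_request(api_host: str, agent_id: str, authorization: str, messages: list) -> urllib.request.Request:
    """Prepare a POST request for the agent chat completions endpoint without sending it."""
    url = (
        f"https://{api_host}/api/v1/compatibility/openai/v1/assistants/"
        f"{quote(agent_id, safe='')}/chat/completions"
    )
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": authorization,  # placeholder value; format not documented here
            "Content-Type": "application/json",
        },
    )


request = build_request(
    "example.com",   # placeholder host
    "agent-1",       # placeholder agent ID
    "<value>",       # placeholder Authorization header value
    [{"role": "user", "content": "Hello!"}],
)
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen(request)` would send it. That step is left out so the sketch runs without network access.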