Create Chat Completion

Create a chat completion.

This method creates a model response for the given chat conversation, and it is compatible with the OpenAI endpoint for creating a chat completion.

Request
URI
POST
https://{api_host}/api/v1/compatibility/openai/v1/chat/completions
Request Body
ChatCompletionsRequestPayload of type(s) application/json
Required

{
    "messages": [
        {
            "content": "You are a helpful assistant.",
            "role": "system"
        },
        {
            "content": "Hello!",
            "role": "user"
        }
    ],
    "model": "model-to-use"
}
With optional properties:
{
    "temperature": "number",
    "n": 0,
    "stop": [
        "string"
    ],
    "max_tokens": 0,
    "stream": false,
    "stream_options": {
        "include_usage": false
    },
    "model": "string",
    "messages": [
        {
            "role": "string",
            "content": "string",
            "tool_calls": [
                {
                    "id": "string",
                    "type": "string",
                    "function": {
                        "name": "string",
                        "arguments": "string"
                    }
                }
            ],
            "refusal": "string",
            "tool_call_id": "string"
        }
    ],
    "tools": [
        {
            "type": "string",
            "function": {
                "name": "string",
                "description": "string",
                "parameters": {
                    "properties": {
                        "properties": {
                            "type": "string",
                            "description": "string",
                            "enum": [
                                {}
                            ],
                            "items": {
                                "type": "string"
                            }
                        }
                    },
                    "type": "string",
                    "required": [
                        "string"
                    ],
                    "additionalProperties": false
                },
                "strict": false
            },
            "uc_function": {
                "name": "string"
            }
        }
    ],
    "seed": 0,
    "tool_choice": "string"
}
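The tool_calls and tool_call_id fields in the schema above are easier to follow in a concrete exchange. The Python sketch below builds a message list in which the assistant requests a tool call and a follow-up message returns the result. The function name get_weather, the call ID, the "tool" role, and the empty assistant content are illustrative assumptions borrowed from the OpenAI convention; this page does not define them.

```python
import json

# Hypothetical call ID and function name, used for illustration only.
CALL_ID = "call_001"

messages = [
    {"role": "user", "content": "What is the weather in Paris?"},
    {
        # The assistant asks for a tool call instead of answering directly.
        "role": "assistant",
        "content": "",  # assumption: empty text when only a tool call is made
        "tool_calls": [
            {
                "id": CALL_ID,
                "type": "function",
                "function": {
                    "name": "get_weather",
                    # "arguments" is a string in the schema, so the
                    # arguments object is JSON-encoded.
                    "arguments": json.dumps({"city": "Paris"}),
                },
            }
        ],
    },
    {
        # The tool result refers back to the call through tool_call_id.
        "role": "tool",
        "tool_call_id": CALL_ID,
        "content": json.dumps({"temperature_c": 18}),
    },
]
```

The id of the tool call and the tool_call_id of the reply must match so that the model can pair each result with the call that produced it.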
number
temperature
Optional
Constraints: minimum: 0 maximum: 2 default: 0

Sampling temperature. Higher values make the output more random; lower values make it more deterministic.

integer
n
Optional
Constraints: minimum: 1 default: 1

Number of chat completion choices to generate for the request.

array of string
stop
Optional
Constraints: minItems: 1

Sequences at which the model stops generating further tokens.

integer
max_tokens
Optional
Constraints: minimum: 1

Maximum number of tokens to generate in the completion.

boolean
stream
Optional

Whether to stream the response back as it is generated.

stream_options
Optional

Options for streaming response. Only set this when you set stream: true.
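The rule that stream_options is only meaningful together with stream: true can be enforced when the payload is built. The following is a minimal Python sketch; build_payload is a hypothetical helper name and is not part of this API.

```python
def build_payload(model, messages, stream=False, include_usage=False):
    """Build a request body, adding stream_options only when streaming."""
    payload = {"model": model, "messages": messages}
    if stream:
        payload["stream"] = True
        payload["stream_options"] = {"include_usage": include_usage}
    elif include_usage:
        # stream_options must not be sent for a non-streaming request.
        raise ValueError("include_usage requires stream=True")
    return payload


hello = [{"role": "user", "content": "Hello!"}]
plain = build_payload("model-to-use", hello)
streaming = build_payload("model-to-use", hello, stream=True, include_usage=True)
```

Here plain carries only model and messages, while streaming also carries stream and stream_options.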

string
model
Required

ID of the completions model to use.

array of object
messages
Required
Constraints: minItems: 1

The messages that make up the chat conversation. The content of each message can be a string or an array of content parts.

A content part is one of the following:

  • TextContentPart (mlflow.types.chat.TextContentPart)
  • ImageContentPart (mlflow.types.chat.ImageContentPart)
  • AudioContentPart (mlflow.types.chat.AudioContentPart)
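A message whose content is an array of parts might look like the Python sketch below. The field names ("type", "text", "image_url", "url") follow the OpenAI content-part convention and are an assumption here, since this page names the part classes but does not list their fields. The helper names and the example URL are hypothetical.

```python
def text_part(text):
    """Return a text content part (field names assumed, see above)."""
    return {"type": "text", "text": text}


def image_part(url):
    """Return an image content part that references an image by URL."""
    return {"type": "image_url", "image_url": {"url": url}}


# content as a plain string
simple_message = {"role": "user", "content": "Hello!"}

# content as an array of content parts
multipart_message = {
    "role": "user",
    "content": [
        text_part("Describe this image."),
        image_part("https://example.com/cat.png"),  # placeholder URL
    ],
}
```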
array of object
tools
Optional

A tool definition for the chat endpoint with Unity Catalog integration. The Gateway request accepts a special tool type 'uc_function' for Unity Catalog integration. https://mlflow.org/docs/latest/llms/deployments/uc_integration.html
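Based on the request schema, a tools array could mix a regular function tool with a uc_function tool, as in the Python sketch below. The helper names, the get_weather function, and the Unity Catalog function name are hypothetical, and using "function" as the type value for regular tools is an assumption taken from the OpenAI convention.

```python
def function_tool(name, description, properties, required):
    """Build a regular function tool with a JSON-schema style parameter block."""
    return {
        "type": "function",  # assumed type value for regular tools
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
                "additionalProperties": False,
            },
        },
    }


def uc_function_tool(name):
    """Build a tool that refers to a Unity Catalog function by name."""
    return {"type": "uc_function", "uc_function": {"name": name}}


tools = [
    function_tool(
        "get_weather",
        "Look up the current weather for a city.",
        {"city": {"type": "string", "description": "City name"}},
        ["city"],
    ),
    # Placeholder name, not a real Unity Catalog function.
    uc_function_tool("my_catalog.my_schema.my_function"),
]
```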

integer
seed
Optional

Seed passed to the LLM so that repeated requests with the same seed are as deterministic as possible. Note that this feature is in beta for most inference servers.

tool_choice
Optional

Controls which tool, if any, the model calls.
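The constraints listed for the request properties can be checked on the client before a request is sent, which avoids a round trip that would end in a 422 response. The following Python sketch covers only the constraints stated on this page; validate_request is a hypothetical helper and the server remains the authority on validation.

```python
def validate_request(payload):
    """Return a list of constraint violations; an empty list means none found."""
    errors = []
    if not payload.get("model"):
        errors.append("model is required")
    messages = payload.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        errors.append("messages must contain at least one item")
    temperature = payload.get("temperature")
    if temperature is not None and not (0 <= temperature <= 2):
        errors.append("temperature must be between 0 and 2")
    n = payload.get("n")
    if n is not None and n < 1:
        errors.append("n must be at least 1")
    stop = payload.get("stop")
    if stop is not None and len(stop) < 1:
        errors.append("stop must contain at least one item")
    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and max_tokens < 1:
        errors.append("max_tokens must be at least 1")
    if "stream_options" in payload and not payload.get("stream"):
        errors.append("stream_options requires stream: true")
    return errors


ok = validate_request(
    {"model": "model-to-use", "messages": [{"role": "user", "content": "Hello!"}]}
)
bad = validate_request({"model": "model-to-use", "messages": [], "temperature": 3})
```

For these inputs, ok is an empty list and bad reports the empty messages array and the out-of-range temperature.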

Authentication
This operation uses the following authentication methods.
Responses
200

Successful Response

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "Hello! I am an AI assistant",
                "role": "assistant"
            }
        }
    ],
    "created": 1700173217,
    "id": "3cdb958c-e4cc-4834-b52b-1d1a7f324714",
    "model": "llama-2-70b-chat-hf",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 8,
        "prompt_tokens": 10,
        "total_tokens": 18
    }
}
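Reading the successful response usually means taking the message of the first choice and, where needed, the token usage. The Python sketch below parses the sample response shown above; first_reply is a hypothetical helper name.

```python
import json

# The sample 200 response from this page.
RESPONSE_BODY = """
{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "Hello! I am an AI assistant",
                "role": "assistant"
            }
        }
    ],
    "created": 1700173217,
    "id": "3cdb958c-e4cc-4834-b52b-1d1a7f324714",
    "model": "llama-2-70b-chat-hf",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 8,
        "prompt_tokens": 10,
        "total_tokens": 18
    }
}
"""


def first_reply(response):
    """Return (text, finish_reason) for the first choice in a response."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


response = json.loads(RESPONSE_BODY)
text, finish_reason = first_reply(response)
usage = response["usage"]
```

In the sample, total_tokens (18) is the sum of prompt_tokens (10) and completion_tokens (8).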

400

Invalid model endpoint specified or model endpoint not ready.

Operation doesn't return any data structure

404

Unknown model endpoint requested.

Operation doesn't return any data structure

422

Validation Error

Returns HTTPValidationError of type(s) application/json
{
    "detail": [
        {
            "loc": [
                {}
            ],
            "msg": "string",
            "type": "string"
        }
    ]
}
array of object
detail
Optional

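List of validation errors. Each entry contains the location of the error (loc), a message (msg), and an error type (type).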
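The documented error responses can be turned into readable messages on the client. The Python sketch below maps the 400, 404, and 422 status codes to the descriptions given on this page and flattens the detail array of a validation error. describe_error is a hypothetical helper, and the example loc, msg, and type values are illustrative rather than actual server output.

```python
def describe_error(status, body=None):
    """Return a short description for a documented error status code."""
    if status == 400:
        return "Invalid model endpoint specified or model endpoint not ready."
    if status == 404:
        return "Unknown model endpoint requested."
    if status == 422:
        parts = []
        for item in (body or {}).get("detail", []):
            # loc is a list of path segments, for example ["body", "messages"].
            loc = ".".join(str(segment) for segment in item.get("loc", []))
            parts.append(f"{loc}: {item.get('msg', '')} ({item.get('type', '')})")
        return "Validation Error: " + "; ".join(parts)
    return f"Unexpected status {status}"


# Illustrative 422 body; the values are made up for this example.
example_body = {
    "detail": [
        {"loc": ["body", "messages"], "msg": "field required", "type": "missing"}
    ]
}
message = describe_error(422, example_body)
```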


Code Samples
curl -X POST \
  -H 'Authorization: <value>' \
  -H 'Content-Type: application/json' \
  -d '{"model":"model-to-use","messages":[{"role":"user","content":"Hello!"}]}' \
  https://{api_host}/api/v1/compatibility/openai/v1/chat/completions
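The same request can be prepared in Python with the standard library only. The sketch below builds the request object without sending it. build_request is a hypothetical helper, api.example.com is a placeholder host, and the Authorization value is passed through unchanged because this page does not specify the authentication scheme.

```python
import json
import urllib.request


def build_request(api_host, authorization, model, messages):
    """Build a POST request for the chat completions endpoint."""
    url = f"https://{api_host}/api/v1/compatibility/openai/v1/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": authorization,
            "Content-Type": "application/json",
        },
    )


request = build_request(
    "api.example.com",  # placeholder host
    "<value>",          # placeholder Authorization header value
    "model-to-use",
    [{"role": "user", "content": "Hello!"}],
)

# To send the request against a real host:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```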