ChatCompletionsRequestPayload

Request payload for generating chat completions.

JSON Example

{
    "max_tokens": 64,
    "messages": [
        {
            "content": "You are a helpful assistant.",
            "role": "system"
        },
        {
            "content": "Hello!",
            "role": "user"
        }
    ],
    "model": "model-to-use",
    "n": 1,
    "stop": [
        "END"
    ],
    "stream": false,
    "temperature": 0
}
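The JSON example above can also be assembled programmatically. The following is a minimal Python sketch, not part of the API itself: `build_chat_payload` is a hypothetical helper name. It only checks that the two required fields (`model`, `messages`) are present, and it leaves out optional fields that are not supplied so that server-side defaults can apply.

```python
import json


def build_chat_payload(model, messages, **optional):
    """Assemble a request body from the required fields plus any optional ones.

    Hypothetical helper for illustration; it is not part of the documented API.
    """
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one item")
    payload = {"model": model, "messages": list(messages)}
    # Skip optional fields left as None so they are omitted from the body.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_chat_payload(
    "model-to-use",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    max_tokens=64,
    n=1,
    stop=["END"],
    stream=False,
    temperature=0,
)

# Serialize to the JSON text that would be sent as the request body.
body = json.dumps(payload, sort_keys=True)
```

How the body is sent (URL, authentication, HTTP client) depends on the serving environment and is not covered here.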

number

temperature

Optional

Constraints: minimum: 0 maximum: 2 default: 0

integer

n

Optional

Constraints: minimum: 1 default: 1

array of string

stop

Optional

Constraints: minItems: 1

integer

max_tokens

Optional

Constraints: minimum: 1

boolean

stream

Optional

stream_options

Optional

Options for the streaming response. Set this only when stream is true.
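The dependency between stream_options and stream can be checked on the client before a request is sent. The sketch below is an illustration under assumptions: `check_stream_options` is a hypothetical helper, and the key inside the example `stream_options` object is a placeholder, since the contents of that object are not described in this section.

```python
def check_stream_options(payload):
    """Return an error message if stream_options is set without stream: true.

    Returns None when the combination is acceptable.
    Hypothetical client-side check; the server may validate differently.
    """
    has_options = payload.get("stream_options") is not None
    streaming = payload.get("stream") is True
    if has_options and not streaming:
        return "stream_options may only be set when stream is true"
    return None


# "example_option" is a made-up placeholder key, not a documented option.
ok_payload = {"stream": True, "stream_options": {"example_option": True}}
bad_payload = {"stream": False, "stream_options": {"example_option": True}}

ok_result = check_stream_options(ok_payload)
bad_result = check_stream_options(bad_payload)
```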

string

model

Required

ID of the completions model to use.

array of ChatMessage

messages

Required

Constraints: minItems: 1
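The constraints listed so far can be mirrored in a small client-side check. This is a sketch rather than the service's own validation: `validate_chat_payload` is a hypothetical name, and it assumes that the integer field with minimum 1 and default 1 is the `n` field shown in the JSON example.

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def _is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_chat_payload(payload):
    """Return a list of constraint violations; an empty list means none found."""
    errors = []

    model = payload.get("model")
    if not isinstance(model, str) or not model:
        errors.append("model: required string")

    messages = payload.get("messages")
    if not isinstance(messages, list) or len(messages) < 1:
        errors.append("messages: required array with at least 1 item")

    temperature = payload.get("temperature", 0)  # documented default: 0
    if not _is_number(temperature) or not 0 <= temperature <= 2:
        errors.append("temperature: number between 0 and 2")

    n = payload.get("n", 1)  # assumed field name; documented default: 1
    if not _is_int(n) or n < 1:
        errors.append("n: integer >= 1")

    if "stop" in payload:
        stop = payload["stop"]
        if (
            not isinstance(stop, list)
            or len(stop) < 1
            or not all(isinstance(s, str) for s in stop)
        ):
            errors.append("stop: array of strings with at least 1 item")

    if "max_tokens" in payload:
        max_tokens = payload["max_tokens"]
        if not _is_int(max_tokens) or max_tokens < 1:
            errors.append("max_tokens: integer >= 1")

    return errors


valid = {
    "model": "model-to-use",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
    "stop": ["END"],
    "temperature": 0,
}
invalid = {
    "model": "model-to-use",
    "messages": [],
    "temperature": 3,
    "stop": [],
    "max_tokens": 0,
}

valid_errors = validate_chat_payload(valid)
invalid_errors = validate_chat_payload(invalid)
```

Returning a list instead of raising on the first problem lets a caller report every violation at once.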

array of ChatToolWithUC

tools

Optional

integer

seed

Optional

Seed passed to the LLM so that repeated requests with the same seed are as deterministic as possible. This feature is in beta for most inference servers.
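As an illustration of reusing a seed across repeated requests, the sketch below attaches the same seed to copies of one payload. `with_seed` is a hypothetical helper, and the seed value is arbitrary. Identical request bodies are only a precondition: whether the responses actually match depends on the inference server.

```python
import copy


def with_seed(payload, seed):
    """Return a copy of the payload with an integer seed attached.

    Hypothetical helper; the original payload is left unmodified.
    """
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError("seed must be an integer")
    seeded = copy.deepcopy(payload)
    seeded["seed"] = seed
    return seeded


base = {
    "model": "model-to-use",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0,
}

# Two requests built with the same seed have identical bodies.
first = with_seed(base, 1234)
second = with_seed(base, 1234)
```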

tool_choice

Optional

Parameter To

Create Chat Completion