ChatCompletionsRequestPayload
Request payload for generating chat completions.
JSON Example
{
  "max_tokens": 64,
  "messages": [
    {
      "content": "You are a helpful assistant.",
      "role": "system"
    },
    {
      "content": "Hello!",
      "role": "user"
    }
  ],
  "model": "model-to-use",
  "n": 1,
  "stop": [
    "END"
  ],
  "stream": false,
  "temperature": 0
}
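As a sketch of how this payload might be sent, the snippet below builds an HTTP POST carrying the JSON example using only the standard library. The endpoint path (`/chat/completions`) and the bearer-token `Authorization` header are assumptions; substitute whatever your inference server actually expects.

```python
import json
import urllib.request

def build_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying a ChatCompletionsRequestPayload.

    The endpoint path and auth scheme below are assumptions, not part of
    the documented payload; adjust them for your server.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )

# The JSON example from this page, as a Python dict.
payload = {
    "max_tokens": 64,
    "messages": [
        {"content": "You are a helpful assistant.", "role": "system"},
        {"content": "Hello!", "role": "user"},
    ],
    "model": "model-to-use",
    "n": 1,
    "stop": ["END"],
    "stream": False,
    "temperature": 0,
}

req = build_request("https://example.invalid/v1", "YOUR_API_KEY", payload)
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```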
integer
seed
Optional
Seed to propagate to the LLM so that repeated requests with the same seed are as deterministic as possible. Note that this feature is in beta for most inference servers.
number
temperature
Optional
Constraints:
minimum: 0
maximum: 2
default: 0
Sampling temperature. Higher values make the output more random; lower values make it more focused and deterministic.
integer
n
Optional
Constraints:
minimum: 1
default: 1
Number of chat-completion choices to generate for each request.
array of
string
stop
Optional
Constraints:
minItems: 1
Sequences at which the model stops generating further tokens.
integer
max_tokens
Optional
Constraints:
minimum: 1
Maximum number of tokens to generate for the completion.
boolean
stream
Optional
If set to true, partial message deltas are streamed back as they are generated instead of a single complete response.
string
model
Required
ID of the completions model to use.
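The constraints above can be checked before sending a request. The following is a minimal sketch based only on the constraints documented on this page (a real server may enforce additional rules); the function name is illustrative, not part of any API.

```python
def validate_chat_completions_payload(payload: dict) -> None:
    """Check a ChatCompletionsRequestPayload against the documented constraints.

    Raises ValueError on the first violated constraint.
    """
    # model is the only required field.
    if not isinstance(payload.get("model"), str):
        raise ValueError("model is required and must be a string")
    # temperature: minimum 0, maximum 2.
    if "temperature" in payload and not 0 <= payload["temperature"] <= 2:
        raise ValueError("temperature must be between 0 and 2")
    # n: minimum 1.
    if "n" in payload and payload["n"] < 1:
        raise ValueError("n must be at least 1")
    # stop: minItems 1.
    if "stop" in payload and len(payload["stop"]) < 1:
        raise ValueError("stop must contain at least one sequence")
    # max_tokens: minimum 1.
    if "max_tokens" in payload and payload["max_tokens"] < 1:
        raise ValueError("max_tokens must be at least 1")
```

For example, `validate_chat_completions_payload({"model": "model-to-use", "temperature": 3})` raises a ValueError, while the JSON example above passes.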