ChatCompletionsRequestPayload
Request payload for generating chat completions.
JSON Example
{
  "max_tokens": 64,
  "messages": [
    {
      "content": "You are a helpful assistant.",
      "role": "system"
    },
    {
      "content": "Hello!",
      "role": "user"
    }
  ],
  "model": "model-to-use",
  "n": 1,
  "stop": [
    "END"
  ],
  "stream": false,
  "temperature": 0
}
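The example payload above can also be assembled programmatically. The following is a minimal Python sketch, not an official client: `build_payload` is a hypothetical helper name, and only the field names and values shown in the JSON example are taken from this reference.

```python
import json


def build_payload(model, messages, **options):
    """Assemble a chat-completions request body as a plain dict.

    `build_payload` is a hypothetical helper, not part of any official client.
    """
    payload = {"model": model, "messages": messages}
    # Include only the optional fields the caller actually set.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


payload = build_payload(
    "model-to-use",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    max_tokens=64,
    n=1,
    stop=["END"],
    stream=False,
    temperature=0,
)

# Serialize to the JSON body that would be sent with the request.
body = json.dumps(payload, indent=2, sort_keys=True)
print(body)
```

Leaving unset options out of the dict lets the server apply its documented defaults instead of receiving explicit nulls.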
number
temperature
Optional
Constraints:
minimum: 0
maximum: 2
default: 0
integer
n
Optional
Constraints:
minimum: 1
default: 1
array of string
stop
Optional
Constraints:
minItems: 1
integer
max_tokens
Optional
Constraints:
minimum: 1
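The constraints listed for temperature, n, stop, and max_tokens can be checked on the client before a request is sent. The sketch below is one possible way to do that, assuming the minimum and maximum bounds are inclusive (as in JSON Schema); `validate_sampling_fields` is a hypothetical name, and the server remains the authority on what it accepts.

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_sampling_fields(payload):
    """Return a list of constraint violations (empty if none were found)."""
    errors = []

    temperature = payload.get("temperature", 0)  # documented default: 0
    if isinstance(temperature, bool) or not isinstance(temperature, (int, float)):
        errors.append("temperature must be a number")
    elif not 0 <= temperature <= 2:
        errors.append("temperature must be between 0 and 2")

    n = payload.get("n", 1)  # documented default: 1
    if not _is_int(n) or n < 1:
        errors.append("n must be an integer >= 1")

    stop = payload.get("stop")
    if stop is not None:
        if not isinstance(stop, list) or len(stop) < 1:
            errors.append("stop must be an array with at least one item")
        elif not all(isinstance(s, str) for s in stop):
            errors.append("stop items must be strings")

    max_tokens = payload.get("max_tokens")
    if max_tokens is not None and (not _is_int(max_tokens) or max_tokens < 1):
        errors.append("max_tokens must be an integer >= 1")

    return errors


print(validate_sampling_fields({"temperature": 2.5, "n": 0, "stop": [], "max_tokens": 0}))
```

Returning a list rather than raising on the first problem lets a caller report every violation at once.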
boolean
stream
Optional
stream_options
Optional
Options for the streaming response. Set this only when stream is true.
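The rule that stream_options belongs only on streaming requests can be enforced with a small guard. This is a sketch under assumptions: `check_stream_options` is a hypothetical helper, and the `include_usage` key is borrowed from OpenAI-style APIs purely for illustration, since this reference does not list the fields of stream_options.

```python
def check_stream_options(payload):
    """Raise ValueError if stream_options is present without stream: true."""
    if "stream_options" in payload and payload.get("stream") is not True:
        raise ValueError("stream_options requires stream: true")
    return payload


streaming = check_stream_options(
    {
        "model": "model-to-use",
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": True,
        # Assumed key for illustration; not confirmed by this reference.
        "stream_options": {"include_usage": True},
    }
)
print(streaming["stream"])
```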
string
model
Required
ID of the completions model to use.
integer
seed
Optional
Seed passed to the LLM so that repeated requests with the same seed are as deterministic as possible. Note that this feature is in beta for most inference servers.
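To reuse one seed across repeated requests, the same integer can be attached to each copy of a base payload. The following is a minimal sketch with a hypothetical helper, `with_seed`; it only prepares identical request bodies and makes no claim about how deterministic a given inference server actually is.

```python
import copy


def with_seed(payload, seed):
    """Return a copy of the payload with a fixed integer seed attached."""
    if not isinstance(seed, int) or isinstance(seed, bool):
        raise TypeError("seed must be an integer")
    seeded = copy.deepcopy(payload)  # leave the caller's payload untouched
    seeded["seed"] = seed
    return seeded


base = {
    "model": "model-to-use",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0,
}

first = with_seed(base, 1234)
second = with_seed(base, 1234)
print(first == second)
```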
tool_choice
Optional