ChatCompletionsRequestPayload
Request payload for generating chat completions.
JSON Example
{
  "max_tokens": 64,
  "messages": [
    {
      "content": "You are a helpful assistant.",
      "role": "system"
    },
    {
      "content": "Hello!",
      "role": "user"
    }
  ],
  "model": "model-to-use",
  "n": 1,
  "stop": [
    "END"
  ],
  "stream": false,
  "temperature": 0
}
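As a sketch of how this payload might be sent, the snippet below builds an HTTP POST carrying the JSON example using only the standard library. The endpoint path (`/chat/completions`) and the bearer-token `Authorization` header are assumptions; substitute whatever your inference server actually expects.

```python
import json
import urllib.request

def build_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying a ChatCompletionsRequestPayload.

    The endpoint path and auth scheme below are assumptions, not part of
    the documented payload; adjust them for your server.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )

# The JSON example from this page, as a Python dict.
payload = {
    "max_tokens": 64,
    "messages": [
        {"content": "You are a helpful assistant.", "role": "system"},
        {"content": "Hello!", "role": "user"},
    ],
    "model": "model-to-use",
    "n": 1,
    "stop": ["END"],
    "stream": False,
    "temperature": 0,
}

req = build_request("https://example.invalid/v1", "YOUR_API_KEY", payload)
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```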
integer
seed
Optional
Seed to propagate to the LLM so that repeated requests with the same seed are as deterministic as possible. Note that this feature is in beta for most inference servers.
number
temperature
Optional
Constraints:
minimum: 0
maximum: 2
default: 0
Sampling temperature. Higher values make the output more random; lower values make it more focused and deterministic.
integer
n
Optional
Constraints:
minimum: 1
default: 1
Number of chat-completion choices to generate for each request.
array of
string
stop
Optional
Constraints:
minItems: 1
Sequences at which the model stops generating further tokens.
integer
max_tokens
Optional
Constraints:
minimum: 1
Maximum number of tokens to generate for the completion.
boolean
stream
Optional
If set to true, partial message deltas are streamed back as they are generated instead of a single complete response.
string
model
Required
ID of the completions model to use.
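The constraints above can be checked before sending a request. The following is a minimal sketch based only on the constraints documented on this page (a real server may enforce additional rules); the function name is illustrative, not part of any API.

```python
def validate_chat_completions_payload(payload: dict) -> None:
    """Check a ChatCompletionsRequestPayload against the documented constraints.

    Raises ValueError on the first violated constraint.
    """
    # model is the only required field.
    if not isinstance(payload.get("model"), str):
        raise ValueError("model is required and must be a string")
    # temperature: minimum 0, maximum 2.
    if "temperature" in payload and not 0 <= payload["temperature"] <= 2:
        raise ValueError("temperature must be between 0 and 2")
    # n: minimum 1.
    if "n" in payload and payload["n"] < 1:
        raise ValueError("n must be at least 1")
    # stop: minItems 1.
    if "stop" in payload and len(payload["stop"]) < 1:
        raise ValueError("stop must contain at least one sequence")
    # max_tokens: minimum 1.
    if "max_tokens" in payload and payload["max_tokens"] < 1:
        raise ValueError("max_tokens must be at least 1")
```

For example, `validate_chat_completions_payload({"model": "model-to-use", "temperature": 3})` raises a ValueError, while the JSON example above passes.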