CompletionsRequestPayload

Request to generate completions.

JSON Example
{
    "max_tokens": 64,
    "model": "model-to-use",
    "n": 1,
    "prompt": "hello",
    "stop": [
        "END"
    ],
    "temperature": 0
}
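
As a rough illustration, the example payload above could be serialized and wrapped in an HTTP POST request as follows. This is only a sketch: the endpoint URL and the `build_request` helper are hypothetical and are not defined by this reference, so substitute whatever your deployment actually exposes.

```python
import json
import urllib.request

# Hypothetical endpoint (assumption); this reference does not specify a URL.
COMPLETIONS_URL = "https://example.invalid/v1/completions"

# The same fields as the JSON example above.
payload = {
    "max_tokens": 64,
    "model": "model-to-use",
    "n": 1,
    "prompt": "hello",
    "stop": ["END"],
    "temperature": 0,
}


def build_request(payload, url=COMPLETIONS_URL):
    """Serialize the payload to JSON and wrap it in a POST request object."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(payload)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

The request is built but not sent, so the sketch can be inspected without network access.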
number
temperature
Optional
Constraints: minimum: 0, maximum: 2, default: 0

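Sampling temperature. Higher values make the output more random; 0 makes it as close to deterministic as possible.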

integer
n
Optional
Constraints: minimum: 1, default: 1

Number of completions to generate for the prompt.

array of string
stop
Optional
Constraints: minItems: 1

Sequences at which the model stops generating further tokens.

integer
max_tokens
Optional
Constraints: minimum: 1

Maximum number of tokens to generate in the completion.

boolean
stream
Optional

Whether to stream the response back incrementally instead of returning it all at once.

stream_options
Optional

Options for the streaming response. Set this only when stream is true.

string
model
Required

ID of the completions model to use.

string
prompt
Required

The prompt text to generate completions for.

integer
seed
Optional

Seed passed to the LLM so that repeated requests with the same seed are as deterministic as possible. Note that this feature is in beta for most inference servers.
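
The constraints listed for the fields above can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `build_completions_payload` is a hypothetical helper, not part of any documented client library, and it enforces only the rules stated in this reference (required fields, minimums and maximums, and `stream_options` being set only together with `stream: true`).

```python
def build_completions_payload(model, prompt, *, temperature=None, n=None,
                              stop=None, max_tokens=None, stream=None,
                              stream_options=None, seed=None):
    """Validate the documented constraints and return a request payload dict.

    Hypothetical helper for illustration; optional fields left as None are
    omitted so that the server applies its own defaults.
    """
    # model and prompt are the only required fields, both strings.
    if not isinstance(model, str) or not isinstance(prompt, str):
        raise TypeError("model and prompt are required strings")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if n is not None and n < 1:
        raise ValueError("n must be at least 1")
    if stop is not None and len(stop) < 1:
        raise ValueError("stop must contain at least one item")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    # stream_options is meaningful only when streaming is enabled.
    if stream_options is not None and stream is not True:
        raise ValueError("stream_options requires stream to be true")

    payload = {"model": model, "prompt": prompt}
    optional = {
        "temperature": temperature,
        "n": n,
        "stop": stop,
        "max_tokens": max_tokens,
        "stream": stream,
        "stream_options": stream_options,
        "seed": seed,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


example = build_completions_payload(
    "model-to-use", "hello", max_tokens=64, stop=["END"], temperature=0
)
```

Omitting unset optional fields, rather than sending explicit nulls, is a design choice of this sketch; whether the server accepts nulls is not stated here.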
