AgentCompletionsRequestPayload
Request to generate completions from an agent.
{
  "max_tokens": 64,
  "model": "model-to-use",
  "n": 1,
  "prompt": "hello",
  "stop": [
    "END"
  ],
  "temperature": 0
}
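A minimal sketch of sending this payload with Python's standard library. The endpoint URL and bearer-token header are placeholders for illustration, not part of this schema:

```python
import json
import urllib.request

# Example payload mirroring the schema above.
payload = {
    "max_tokens": 64,
    "model": "model-to-use",  # optional; omit to use the agent's configured model
    "n": 1,
    "prompt": "hello",
    "stop": ["END"],
    "temperature": 0,
}

# The URL and Authorization header below are placeholders.
req = urllib.request.Request(
    "https://api.example.com/v1/agents/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <API_KEY>",
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # uncomment against a real endpoint
```

The request body is plain JSON, so any HTTP client works equally well.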
Parameters

model: Optional ID of the model to use. If provided, it must match the model specified in the agent configuration. Unless the client needs to verify that the agent is using a particular model, do not specify this value and the API will choose the correct model. For compatibility with the OpenAI client SDK, this parameter may either be left unset, or an empty string may be used to indicate that the agent's default configuration should be used.

Seed: Seed to propagate to the LLM so that repeated requests with the same seed are as deterministic as possible. Note that this feature is in beta for most inference servers.

Session creation: If true, the request creates a new agent session, and the LLM interaction is stored as context for subsequent agent interactions that use the generated session.

Session reuse: If set, use and extend the context stored in the given session for all LLM interactions.

Additional parameters: prompt, temperature, n, stop, max_tokens, stream.
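The two session options above can be sketched as a pair of requests. The key names `create_session` and `session` are hypothetical stand-ins, since the payload's actual key names for these options are not given here:

```python
# First request: ask the server to create a new session and store this
# interaction as context. The key name "create_session" is hypothetical.
first = {
    "prompt": "hello",
    "max_tokens": 64,
    "create_session": True,
}

# Suppose the response carried back a session ID (placeholder value):
session_id = "sess-123"

# Follow-up request: reuse and extend the context stored in that session.
# The key name "session" is hypothetical.
follow_up = {
    "prompt": "continue where we left off",
    "max_tokens": 64,
    "session": session_id,
}
```

Each follow-up request that passes the session ID both reads the stored context and extends it with the new interaction.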