Endpoints
Chat Completions
Generate a chat response from a model.
POST
Request body
Model ID to use. See available models.
Array of message objects with
role and content.role:"system","user", or"assistant"content: The message text
Sampling temperature (0-2). Lower = more focused, higher = more creative.
Maximum tokens to generate in the response.
Stream response tokens as server-sent events.
Nucleus sampling parameter (0-1).