Chat

ChatResource

Completions

ChatResource.CompletionsResource

Methods

create() ->
post /v5/chat/completions

Chat Completions

Parameters
messages: List[Dict[str, object]]

OpenAI standard message format.

model: str

Model specified as model_vendor/model, for example openai/gpt-4o.
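As a sketch, the two required parameters above combine into a minimal request body like the following (only the payload shape is shown, since the client import path is not documented here; the model id is just an example):

```python
# Minimal request body for POST /v5/chat/completions.
# messages follows the OpenAI standard message format;
# model uses the model_vendor/model convention.
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
```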

audio: Optional[Dict[str, object]]

Parameters for audio output. Required when audio output is requested with modalities: ['audio'].

frequency_penalty: Optional[float]
(maximum: 2, minimum: -2)

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.

function_call: Optional[Dict[str, object]]

Deprecated in favor of tool_choice. Controls which function is called by the model.

functions: Optional[List[Dict[str, object]]]

Deprecated in favor of tools. A list of functions the model may generate JSON inputs for.

logit_bias: Optional[Dict[str, int]]

Modify the likelihood of specified tokens appearing in the completion. Maps tokens to bias values from -100 to 100.
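A hedged sketch of a logit_bias value; the token ID below is an arbitrary placeholder, not a real tokenizer ID:

```python
# logit_bias maps tokenizer token IDs (as strings) to bias values in [-100, 100].
# -100 effectively bans a token; +100 strongly favors it.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Pick a number."}],
    "logit_bias": {"1734": -100},  # "1734" is a placeholder token ID
}
```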

logprobs: Optional[bool]

Whether to return log probabilities of the output tokens or not.

max_completion_tokens: Optional[int]

An upper bound for the number of tokens that can be generated, including visible output tokens and reasoning tokens.

max_tokens: Optional[int]

Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate.

metadata: Optional[Dict[str, str]]

Developer-defined tags and values used for filtering completions in the dashboard.

modalities: Optional[List[str]]

Output types that you would like the model to generate for this request.

n: Optional[int]

How many chat completion choices to generate for each input message.

parallel_tool_calls: Optional[bool]

Whether to enable parallel function calling during tool use.

prediction: Optional[Dict[str, object]]

Static predicted output content, such as the content of a text file being regenerated.

presence_penalty: Optional[float]
(maximum: 2, minimum: -2)

Number between -2.0 and 2.0. Positive values penalize tokens based on whether they appear in the text so far.

reasoning_effort: Optional[str]

For o1 models only. Constrains effort on reasoning. Values: low, medium, high.

response_format: Optional[Dict[str, object]]

An object specifying the format that the model must output.
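The exact shape of the response_format object is not documented here; assuming it follows the OpenAI-style format, JSON mode would be requested like this:

```python
# Request JSON-only output (object shape assumed from the OpenAI-style API).
payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "user", "content": "List three colors as a JSON object."}
    ],
    "response_format": {"type": "json_object"},
}
```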

seed: Optional[int]

If specified, the system will attempt to sample deterministically for repeated requests with the same seed.

stop: Optional[Union[str, List[str]]]

Up to 4 sequences where the API will stop generating further tokens.

StopUnionMember0 = str
StopUnionMember1 = List[str]
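Both branches of the stop union, sketched as request bodies:

```python
base = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Write a short story."}],
}
payload_single = {**base, "stop": "\n\n"}              # StopUnionMember0: str
payload_multi = {**base, "stop": ["THE END", "\n\n"]}  # StopUnionMember1: List[str], max 4
```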
store: Optional[bool]

Whether to store the output for use in model distillation or evals products.

stream: Optional[Literal[False]]

If true, partial message deltas will be sent as server-sent events.

stream_options: Optional[Dict[str, object]]

Options for streaming response. Only set this when stream is true.
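A streaming request sketch; the include_usage key inside stream_options is an assumption borrowed from the OpenAI-style API, not confirmed by this reference:

```python
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Count to five."}],
    "stream": True,
    # Only set stream_options when stream is true.
    "stream_options": {"include_usage": True},  # key name is an assumption
}
```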

temperature: Optional[float]
(maximum: 2, minimum: 0)

What sampling temperature to use. Higher values make output more random, lower more focused.

tool_choice: Optional[Union[str, Dict[str, object]]]

Controls which tool is called by the model. Values: none, auto, required, or specific tool.

ToolChoiceUnionMember0 = str
ToolChoiceUnionMember1 = Dict[str, object]
tools: Optional[List[Dict[str, object]]]

A list of tools the model may call. Currently, only functions are supported. Max 128 functions.
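A single function tool sketched in the OpenAI-style schema (the nested layout is an assumption based on that format, and get_weather is a hypothetical function):

```python
# One function tool; up to 128 such entries are allowed in tools.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical function name
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Weather in Paris?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",  # none, auto, required, or a specific tool object
}
```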

top_k: Optional[int]

Only sample from the top K options for each subsequent token.

top_logprobs: Optional[int]
(maximum: 20, minimum: 0)

Number of most likely tokens to return at each position, with associated log probability.
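logprobs and top_logprobs combined in one sketch; per the bounds above, top_logprobs must be between 0 and 20, and it presumably only applies when logprobs is enabled:

```python
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hi"}],
    "logprobs": True,   # return log probabilities of output tokens
    "top_logprobs": 5,  # 0-20 most likely tokens per position
}
```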

top_p: Optional[float]
(maximum: 1, minimum: 0)

Alternative to temperature. Only tokens comprising top_p probability mass are considered.

models() ->
get /v5/chat/completions/models

List Chat Completion Models
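The response shape of models() is only hinted at by the ModelDefinition type below; assuming each entry carries an id in the model_vendor/model convention, the vendor can be recovered by splitting on the first slash (both ids here are illustrative):

```python
# Hypothetical ModelDefinition entries (the "id" field name is an assumption).
models = [
    {"id": "openai/gpt-4o"},
    {"id": "anthropic/claude-3-5-sonnet"},
]
# model_vendor/model convention: the vendor is everything before the first "/".
vendors = {m["id"].split("/", 1)[0] for m in models}
```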

Domain types

class ChatCompletion: ...
class ChatCompletionChunk: ...
class ModelDefinition: ...