Tokios API Endpoints: OpenAI & Anthropic Reference

All Tokios API requests are sent to https://api.tokios.com and authenticated with your sk-tokios-... key as a Bearer token. Tokios exposes three endpoints: OpenAI chat completions, OpenAI responses, and Anthropic messages.

Base URL

https://api.tokios.com

Authentication

Every request requires the following headers:

Authorization: Bearer sk-tokios-YOUR_KEY_HERE
Content-Type: application/json
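
As a minimal sketch of the headers above, the following Python helper builds an authenticated request with the standard library. The name `build_request` and the key passed to it are illustrative and not part of the Tokios API. Nothing is sent until the request object is handed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://api.tokios.com"  # base URL from this reference


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the required headers."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` inside a `with` block and decode the body with `json.load`.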

POST /v1/chat/completions

OpenAI-compatible chat completions endpoint. Send a list of messages and receive a completion from your registered model.

Request body

model (string, required)
The registered deployment name (e.g. "gemma-tunnel").

messages (array, required)
Array of message objects. Each object must have:

role: one of "system", "user", or "assistant"
content: the message text (string)

stream (boolean, default: false)
Stream the response using server-sent events.

temperature (number)
Sampling temperature between 0 and 2. Higher values produce more random output.

max_tokens (integer)
Maximum number of tokens to generate.
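
The fields above can be assembled and checked on the client before a request is sent. This is a sketch only: `chat_payload` is a hypothetical helper, and the checks mirror the constraints listed in this section. How the server itself treats out-of-range values is not documented here.

```python
from typing import Optional

ALLOWED_ROLES = {"system", "user", "assistant"}


def chat_payload(model: str, messages: list, *, stream: bool = False,
                 temperature: Optional[float] = None,
                 max_tokens: Optional[int] = None) -> dict:
    """Assemble a /v1/chat/completions body, checking the documented constraints."""
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            raise ValueError("content must be a string")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": messages}
    # Optional fields are included only when set, so server defaults apply otherwise.
    if stream:
        payload["stream"] = True
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```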

Example request

curl https://api.tokios.com/v1/chat/completions \
  -H "Authorization: Bearer sk-tokios-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-tunnel",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2 + 2?"}
    ]
  }'

Example response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gemma-tunnel",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "2 + 2 equals 4."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}
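
Reading the response usually comes down to pulling the assistant text out of the first choice and checking token usage. The snippet below parses the example response shown above; `first_reply` is an illustrative helper name.

```python
import json

# The example response above, as the JSON text an HTTP client would receive.
RAW = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gemma-tunnel",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "2 + 2 equals 4."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 9, "total_tokens": 33}
}
"""


def first_reply(response: dict) -> str:
    """Return the assistant text of the first choice."""
    return response["choices"][0]["message"]["content"]


response = json.loads(RAW)
reply = first_reply(response)
usage = response["usage"]
print(reply)                  # 2 + 2 equals 4.
print(usage["total_tokens"])  # 33
```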

POST /v1/messages

Anthropic-compatible messages endpoint. Send a messages array and receive a response in Anthropic’s format.

Request body

model (string, required)
The registered deployment name.

messages (array, required)
Array of message objects. Each object must have:

role: one of "user" or "assistant"
content: the message text (string)

max_tokens (integer, required)
Maximum number of tokens to generate.

system (string)
System prompt to prepend to the conversation.

stream (boolean, default: false)
Stream the response using server-sent events.
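
This body differs from the chat completions body in three ways: max_tokens is required, the system prompt is a top-level field, and message roles are limited to "user" and "assistant". The sketch below converts a chat-style message list into this shape. `messages_payload` is a hypothetical helper, and joining several system messages with a blank line is a choice made here, not something this reference specifies.

```python
def messages_payload(model: str, chat_messages: list, max_tokens: int,
                     stream: bool = False) -> dict:
    """Convert chat-style messages into a /v1/messages request body."""
    system_parts = [m["content"] for m in chat_messages if m["role"] == "system"]
    turns = [m for m in chat_messages if m["role"] in ("user", "assistant")]

    payload = {"model": model, "max_tokens": max_tokens, "messages": turns}
    if system_parts:
        # Assumption: multiple system messages are merged into one prompt.
        payload["system"] = "\n\n".join(system_parts)
    if stream:
        payload["stream"] = True
    return payload
```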

Example request

curl https://api.tokios.com/v1/messages \
  -H "Authorization: Bearer sk-tokios-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-tunnel",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Explain async/await"}
    ]
  }'

Example response

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Async/await is a syntax for handling asynchronous operations..."
    }
  ],
  "model": "gemma-tunnel",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 48
  }
}
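
In this format the reply text sits inside a list of content blocks, and usage reports input and output tokens without a total. A small sketch for reading the example response above, with `message_text` as an illustrative helper name:

```python
def message_text(response: dict) -> str:
    """Concatenate the text of every "text" block in the content array."""
    return "".join(
        block["text"] for block in response["content"] if block.get("type") == "text"
    )


# The example response above, written as a Python dict.
response = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [
        {"type": "text",
         "text": "Async/await is a syntax for handling asynchronous operations..."}
    ],
    "model": "gemma-tunnel",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 48},
}

text = message_text(response)
# No total is reported in this format, so it is summed by the caller.
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]
print(total_tokens)  # 60
```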

POST /v1/responses

OpenAI-compatible responses endpoint. Send an input string or an array of input items and receive a response in the newer OpenAI responses format.

Request body

model (string, required)
The registered deployment name.

input (string | array, required)
The input text or array of input items to generate a response for.

stream (boolean, default: false)
Stream the response using server-sent events.
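
When stream is true, the body arrives as server-sent events rather than a single JSON object. This reference does not show what each event contains, so the sketch below stops at the generic SSE layer: it extracts the raw data payload of each event from an iterable of text lines. `iter_sse_data` is a hypothetical helper, and decoding each payload (for example as JSON) is left to the caller.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event in a stream of text lines."""
    buffer = []
    for raw_line in lines:
        line = raw_line.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comment lines are ignored here.
    if buffer:
        # Lenient choice: flush a final event that lacks a trailing blank line.
        yield "\n".join(buffer)
```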

Example request

curl https://api.tokios.com/v1/responses \
  -H "Authorization: Bearer sk-tokios-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-tunnel",
    "input": "Write a haiku about tunnels."
  }'

Example response

{
  "id": "resp_abc123",
  "object": "response",
  "created": 1717000000,
  "model": "gemma-tunnel",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Dark passage below,\nWater rushes unseen through—\nLight waits at the end."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 8,
    "output_tokens": 18,
    "total_tokens": 26
  }
}
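
Here the generated text is nested two levels deep: output items of type "message", each holding content blocks of type "output_text". The sketch below collects that text from a trimmed response shaped like the example above. `output_text` is an illustrative helper name, and the sample text is shortened for brevity.

```python
def output_text(response: dict) -> str:
    """Join the text of all output_text blocks found in message items."""
    parts = []
    for item in response["output"]:
        if item.get("type") != "message":
            continue  # skip any non-message output items
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block["text"])
    return "".join(parts)


# Trimmed response with the same structure as the example above.
response = {
    "id": "resp_abc123",
    "object": "response",
    "model": "gemma-tunnel",
    "output": [
        {"type": "message",
         "role": "assistant",
         "content": [{"type": "output_text", "text": "Light waits at the end."}]}
    ],
    "usage": {"input_tokens": 8, "output_tokens": 18, "total_tokens": 26},
}

text = output_text(response)
print(text)  # Light waits at the end.
```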

Error codes

HTTP Status	Meaning
401	Invalid or missing API key
403	API key doesn’t have access to the requested model
404	Model deployment not found
429	Rate limit exceeded
503	Connector offline: the tunnel is not active

A 503 response means the connector for that model deployment is not currently running. Start the connector and verify the tunnel is established before retrying.
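
A client can turn the statuses above into a simple retry decision. In this sketch the meanings come from the table, while the retry policy is an assumption: 503 is treated as retryable because the text above says to retry once the connector is running, and 429 is retried with exponential backoff by common convention, which this reference does not specify. `describe_error` and `backoff_delay` are hypothetical helpers.

```python
from typing import Tuple

ERROR_MEANINGS = {
    401: "Invalid or missing API key",
    403: "API key doesn't have access to the requested model",
    404: "Model deployment not found",
    429: "Rate limit exceeded",
    503: "Connector offline: the tunnel is not active",
}

# Assumption: only these statuses are worth retrying.
RETRYABLE = {429, 503}


def describe_error(status: int) -> Tuple[str, bool]:
    """Return (meaning, should_retry) for an HTTP status code."""
    return ERROR_MEANINGS.get(status, "Unexpected status"), status in RETRYABLE


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), capped at `cap`."""
    return min(cap, base * 2 ** attempt)
```

For a 503, waiting alone does not help unless the connector is started in the meantime.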
