All Tokios API requests are sent to https://api.tokios.com with your sk-tokios-... key as a Bearer token. Tokios exposes three endpoints covering the OpenAI chat completions, OpenAI responses, and Anthropic messages surfaces.
Base URL
https://api.tokios.com

Authentication

Every request requires the following headers:
Authorization: Bearer sk-tokios-YOUR_KEY_HERE
Content-Type: application/json
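
As a rough illustration of the two headers above, the following Python sketch builds (but does not send) an authenticated POST request using only the standard library. The helper name `build_request` is hypothetical and not part of any Tokios SDK; the key is a placeholder.

```python
import json
import urllib.request

BASE_URL = "https://api.tokios.com"  # base URL documented on this page


def build_request(path: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying both required headers.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "/v1/chat/completions",
    "sk-tokios-YOUR_KEY_HERE",  # placeholder, replace with a real key
    {"model": "gemma-tunnel", "messages": [{"role": "user", "content": "Hi"}]},
)
# To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the headers before any network call is made.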

POST /v1/chat/completions

OpenAI-compatible chat completions endpoint. Send a list of messages and receive a completion from your registered model.

Request body

model (string, required): The registered deployment name (e.g. "gemma-tunnel").
messages (array, required): Array of message objects. Each object must have:
  • role: one of "system", "user", or "assistant"
  • content: the message text (string)
stream (boolean, default: false): Stream the response using server-sent events.
temperature (number): Sampling temperature between 0 and 2. Higher values produce more random output.
max_tokens (integer): Maximum number of tokens to generate.
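
The field rules above can be checked client-side before a request is sent. This is a minimal sketch, assuming you want to fail fast locally; `build_chat_payload` is a hypothetical helper, and the server may apply additional validation not described here.

```python
from typing import Optional

ALLOWED_ROLES = {"system", "user", "assistant"}  # roles listed for this endpoint


def build_chat_payload(
    model: str,
    messages: list,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stream: bool = False,
) -> dict:
    """Validate the documented fields and return a request body dict."""
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            raise ValueError("content must be a string")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": messages}
    # Optional fields are included only when set, so server defaults apply otherwise.
    if temperature is not None:
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if stream:
        payload["stream"] = True
    return payload


payload = build_chat_payload(
    "gemma-tunnel",
    [{"role": "user", "content": "What is 2 + 2?"}],
    temperature=0.2,
)
```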

Example request

curl https://api.tokios.com/v1/chat/completions \
  -H "Authorization: Bearer sk-tokios-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-tunnel",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is 2 + 2?"}
    ]
  }'

Example response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gemma-tunnel",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "2 + 2 equals 4."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 9,
    "total_tokens": 33
  }
}
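
To show how the response shape above is typically consumed, here is a small Python sketch that pulls the reply text, finish reason, and token counts out of a body shaped like the example. The helper name `first_reply` is hypothetical.

```python
import json

# Sample body with the same shape and values as the example response above.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1717000000,
  "model": "gemma-tunnel",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "2 + 2 equals 4."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 24, "completion_tokens": 9, "total_tokens": 33}
}
"""


def first_reply(resp: dict) -> tuple:
    """Return (content, finish_reason) of the first choice."""
    choice = resp["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


resp = json.loads(raw)
text, reason = first_reply(resp)
usage = resp["usage"]

print(text)    # 2 + 2 equals 4.
print(reason)  # stop
print(usage["prompt_tokens"] + usage["completion_tokens"])  # 33
```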

POST /v1/messages

Anthropic-compatible messages endpoint. Send a messages array and receive a response in Anthropic’s format.

Request body

model (string, required): The registered deployment name.
messages (array, required): Array of message objects. Each object must have:
  • role: one of "user" or "assistant"
  • content: the message text (string)
max_tokens (integer, required): Maximum number of tokens to generate.
system (string): System prompt to prepend to the conversation.
stream (boolean, default: false): Stream the response using server-sent events.
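
Unlike the chat completions endpoint, this endpoint takes the system prompt as a top-level `system` field and requires `max_tokens`. The sketch below converts a chat-style message list into a body for this endpoint. The helper `to_messages_payload` is hypothetical, and joining several system messages with a blank line is an assumption made for illustration, not documented behaviour.

```python
def to_messages_payload(model: str, chat_messages: list, max_tokens: int) -> dict:
    """Convert chat-style messages into a /v1/messages request body."""
    # Messages with role "system" are lifted into the top-level field.
    system_parts = [m["content"] for m in chat_messages if m["role"] == "system"]
    turns = [m for m in chat_messages if m["role"] in ("user", "assistant")]

    payload = {"model": model, "max_tokens": max_tokens, "messages": turns}
    if system_parts:
        # Assumption: multiple system messages are joined with a blank line.
        payload["system"] = "\n\n".join(system_parts)
    return payload


body = to_messages_payload(
    "gemma-tunnel",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain async/await"},
    ],
    max_tokens=256,
)
```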

Example request

curl https://api.tokios.com/v1/messages \
  -H "Authorization: Bearer sk-tokios-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-tunnel",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Explain async/await"}
    ]
  }'

Example response

{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Async/await is a syntax for handling asynchronous operations..."
    }
  ],
  "model": "gemma-tunnel",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 48
  }
}
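
Because `content` in this response is an array of typed blocks rather than a single string, client code usually joins the text blocks. A minimal sketch, using a body shaped like the example above (`message_text` is a hypothetical helper):

```python
def message_text(resp: dict) -> str:
    """Concatenate the text of every block whose type is "text"."""
    return "".join(
        block["text"]
        for block in resp.get("content", [])
        if block.get("type") == "text"
    )


# Sample body with the same shape as the example response above.
sample = {
    "id": "msg_abc123",
    "type": "message",
    "role": "assistant",
    "content": [
        {"type": "text", "text": "Async/await is a syntax for handling asynchronous operations..."}
    ],
    "model": "gemma-tunnel",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 48},
}

reply = message_text(sample)
```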

POST /v1/responses

OpenAI-compatible responses endpoint. Send an input string or an array of input items and receive a response in the OpenAI responses format.

Request body

model (string, required): The registered deployment name.
input (string | array, required): The input text or array of input items to generate a response for.
stream (boolean, default: false): Stream the response using server-sent events.
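
When `stream` is true, the body arrives as server-sent events. This page does not document the event names or payloads Tokios emits, so the sketch below only shows generic SSE framing (`event:` and `data:` fields, blank line as separator, `:` comments ignored). The function `iter_sse_events` and the sample event name and payloads are made up for illustration.

```python
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from decoded SSE lines."""
    event, data = "message", []  # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream content; real event names and payloads may differ.
sample_stream = [
    "event: example.delta\n",
    'data: {"text": "Dark"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"text": " passage"}\n',
    "\n",
]
events = list(iter_sse_events(sample_stream))
```

An event left unterminated at the end of the stream is dropped, which matches how SSE clients normally treat incomplete events.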

Example request

curl https://api.tokios.com/v1/responses \
  -H "Authorization: Bearer sk-tokios-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma-tunnel",
    "input": "Write a haiku about tunnels."
  }'

Example response

{
  "id": "resp_abc123",
  "object": "response",
  "created": 1717000000,
  "model": "gemma-tunnel",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Dark passage below,\nWater rushes unseen through—\nLight waits at the end."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 8,
    "output_tokens": 18,
    "total_tokens": 26
  }
}
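
The generated text in this format is nested two levels deep: `output` items of type "message", each holding `content` blocks of type "output_text". A short sketch of extracting it, assuming a body shaped like the example above (`output_text` is a hypothetical helper, and the sample text is shortened):

```python
def output_text(resp: dict) -> str:
    """Join every output_text block found in message-type output items."""
    parts = []
    for item in resp.get("output", []):
        if item.get("type") != "message":
            continue  # skip any non-message output items
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block["text"])
    return "".join(parts)


# Sample body with the same shape as the example response above.
sample = {
    "id": "resp_abc123",
    "object": "response",
    "model": "gemma-tunnel",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Dark passage below"}],
        }
    ],
    "usage": {"input_tokens": 8, "output_tokens": 18, "total_tokens": 26},
}

haiku = output_text(sample)
```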

Error codes

HTTP Status   Meaning
401           Invalid or missing API key
403           API key doesn’t have access to the requested model
404           Model deployment not found
429           Rate limit exceeded
503           Connector offline (the tunnel is not active)
A 503 response means the connector for that model deployment is not currently running. Start the connector and verify the tunnel is established before retrying.
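
One way to act on these status codes is to retry only the transient ones. The sketch below retries 429 and 503 with exponential backoff and raises everything else immediately. The retry policy, delays, and the helper names `backoff_delays` and `send_with_retry` are assumptions for illustration; this page does not specify a recommended backoff, and retrying a 503 only helps once the connector has been started.

```python
import time
import urllib.error
import urllib.request

RETRYABLE = {429, 503}  # rate limit exceeded, connector offline


def backoff_delays(count: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Return `count` exponentially growing delays, capped at `cap` seconds."""
    return [min(cap, base * 2 ** i) for i in range(count)]


def send_with_retry(req, attempts: int = 4, sleep=time.sleep,
                    opener=urllib.request.urlopen) -> bytes:
    """Send `req`, retrying on 429/503; re-raise any other HTTP error."""
    delays = backoff_delays(attempts - 1)
    for attempt in range(attempts):
        try:
            with opener(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            last_attempt = attempt == attempts - 1
            if err.code not in RETRYABLE or last_attempt:
                raise  # 401/403/404 need a fix on the caller's side
            sleep(delays[attempt])
```

The `sleep` and `opener` parameters exist so the function can be exercised without real delays or network access.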