Skip to main content
Tokios connects your private local models — running on Ollama, llama.cpp, vLLM, or LM Studio — to a stable, cloud-reachable API endpoint at api.tokios.com. The connector on your machine dials out over a secure WebSocket; no inbound ports, no public IP, and no NAT configuration required. Point Claude Code, OpenAI Codex, or any OpenAI/Anthropic-compatible client at Tokios and your models become remotely accessible without ever leaving your hardware.

Quick Start

Sign up, install the connector, and make your first API call in minutes.

How It Works

Understand the outbound tunnel architecture and request flow.

Claude Code

Point Claude Code at your own models with two environment variables.

API Endpoints

Full reference for OpenAI- and Anthropic-compatible surfaces.

Get running in three steps

1

Sign up and install the connector

Create a free Tokios account, then download and run the connector next to your local model. The connector opens a single outbound WebSocket — nothing else is exposed on your network.
2

Register your model and get an API key

In the Tokios dashboard, pair the connector, register your model deployment, and generate a tenant-scoped API key (sk-tokios-...).
3

Point your coding agent at Tokios

Set ANTHROPIC_BASE_URL or OPENAI_BASE_URL to https://api.tokios.com and your API key. Your agent now calls your local model through the Tokios endpoint.
Tokios is free to try with no credit card required. The community tier gives you access to the full feature set so you can evaluate it with your own hardware.