Tokios: Private Local Models Behind One API Endpoint

Tokios connects your private local models — running on Ollama, llama.cpp, vLLM, or LM Studio — to a stable, cloud-reachable API endpoint at api.tokios.com. The connector on your machine dials out over a secure WebSocket; no inbound ports, no public IP, and no NAT configuration required. Point Claude Code, OpenAI Codex, or any OpenAI/Anthropic-compatible client at Tokios and your models become remotely accessible without ever leaving your hardware.

Quick Start

How It Works

Understand the outbound tunnel architecture and request flow.

Claude Code

Point Claude Code at your own models with two environment variables.

API Endpoints

Full reference for OpenAI- and Anthropic-compatible surfaces.

Get running in three steps

Create a free Tokios account, then download and run the connector next to your local model. The connector opens a single outbound WebSocket — nothing else is exposed on your network.

In the Tokios dashboard, pair the connector, register your model deployment, and generate a tenant-scoped API key (sk-tokios-...).

Point your coding agent at Tokios

Set ANTHROPIC_BASE_URL or OPENAI_BASE_URL to https://api.tokios.com and your API key. Your agent now calls your local model through the Tokios endpoint.

Tokios is free to try with no credit card required. The community tier gives you access to the full feature set so you can evaluate it with your own hardware.

Get Started with Tokios in Under 5 Minutes

⌘I

Quick Start

How It Works

Claude Code

API Endpoints

​Get running in three steps

Get running in three steps