Documentation

Run Gateway-LLM in production.

Open-source. Single binary. OpenAI-compatible. These docs walk you from docker compose up to a multi-provider router with virtual keys, budgets, and full observability — in the order you’ll actually use them.

Getting started

Everything you need to run Gateway-LLM the first time.

Introduction (3 min read)
Gateway-LLM in 30 seconds — what it is, who runs it, and how the rest of the docs are organised.
Quickstart (3 min read)
Get Gateway-LLM running locally in under five minutes and fire your first OpenAI-compatible request.
Configuration (6 min read)
The shape of config.yaml, every environment variable, and how to manage secrets in production.

Core concepts

How smart routing, providers, and virtual keys fit together.

Providers (4 min read)
Add OpenAI, Anthropic, Google, Bedrock, Ollama, or your own self-hosted models behind a single OpenAI-compatible endpoint.
Smart routing (4 min read)
Three router strategies (round_robin, least_latency, classifier), how they pick deployments, and how they handle failover.
Virtual API keys (3 min read)
Issue per-team keys with rate limits, model allowlists, and hard budget caps. Revoke a key without rotating provider credentials.
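
To make two of the router strategies named above concrete, here is a minimal Python sketch of how a round_robin pick and a least_latency pick might differ. It is an illustration under stated assumptions, not Gateway-LLM's implementation: the `Deployment` record, its fields, and both function names are hypothetical, and failover is reduced to skipping deployments marked unhealthy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deployment:
    # Hypothetical record; field names are illustrative, not Gateway-LLM's schema.
    name: str
    avg_latency_ms: float
    healthy: bool = True


def _healthy(deployments: List[Deployment]) -> List[Deployment]:
    """Return only deployments that are currently marked healthy."""
    candidates = [d for d in deployments if d.healthy]
    if not candidates:
        raise RuntimeError("no healthy deployments available")
    return candidates


def pick_round_robin(deployments: List[Deployment], request_index: int) -> Deployment:
    """Rotate through healthy deployments based on a running request counter."""
    candidates = _healthy(deployments)
    return candidates[request_index % len(candidates)]


def pick_least_latency(deployments: List[Deployment]) -> Deployment:
    """Choose the healthy deployment with the lowest observed average latency."""
    return min(_healthy(deployments), key=lambda d: d.avg_latency_ms)


if __name__ == "__main__":
    pool = [
        Deployment("provider-a", 120.0),
        Deployment("provider-b", 80.0),
        Deployment("provider-c", 200.0, healthy=False),  # skipped by both strategies
    ]
    print([pick_round_robin(pool, i).name for i in range(4)])
    print(pick_least_latency(pool).name)
```

In this sketch, round robin spreads load evenly regardless of speed, while least latency concentrates traffic on the fastest healthy deployment. The Smart routing page describes how the real strategies pick deployments and handle failover.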

Operate & extend

Wiring metrics, tracing, and the language SDKs into your stack.

Observability (3 min read)
Prometheus metrics, OpenTelemetry traces, Datadog, Honeycomb, Tempo, Langfuse, Splunk, and Slack — wire Gateway-LLM into whatever you already run.
SDKs (3 min read)
First-class Python, TypeScript, Go, and VS Code SDKs — plus drop-in compatibility with every existing OpenAI client.