Quickstart
Get Gateway-LLM running locally in under five minutes and fire your first OpenAI-compatible request.
3 min read · updated 2026-04-29
This page walks you through starting the gateway with Docker Compose, issuing a virtual key, and sending a first request. It assumes Docker is installed and that you have at least one provider API key.
Prerequisites
- Docker 24+ (Docker Desktop on macOS / Windows is fine)
- One of: OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY
- (Optional) Go 1.23+ if you want to build from source instead of running the published image
1. Pull and run
The fastest path is docker compose:
git clone https://github.com/showmikb/gateway-llm.git
cd gateway-llm
cp .env.example .env
Open .env and set at least the master key plus one provider key:
GATEWAY_LLM_MASTER_KEY=sk-master-pick-anything-secret
OPENAI_API_KEY=sk-...
# Optional:
# ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=AIza...
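Before starting the stack, it can help to confirm that .env really contains the master key and at least one provider key. The following is a minimal sketch of such a check, not part of Gateway-LLM itself; the helper names `parse_env` and `missing_settings` are hypothetical, and the parser only handles plain KEY=VALUE lines.

```python
from pathlib import Path

# Provider keys named in the prerequisites above.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_settings(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    if not env.get("GATEWAY_LLM_MASTER_KEY"):
        problems.append("GATEWAY_LLM_MASTER_KEY is not set")
    if not any(env.get(name) for name in PROVIDER_KEYS):
        problems.append(
            "no provider key set (need one of: " + ", ".join(PROVIDER_KEYS) + ")"
        )
    return problems


env_file = Path(".env")
if env_file.exists():
    for problem in missing_settings(parse_env(env_file.read_text())):
        print("warning:", problem)
```

The check only looks for non-empty values, so a placeholder such as sk-... still counts as set.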
Then bring everything up:
docker compose up -d
This starts:
- The gateway backend on :8080
- The admin UI on :3000
- A local PostgreSQL (for spend logging + virtual keys)
- A local Redis (for rate limiting)
Tail the logs to confirm:
docker compose logs -f gateway-llm
When you see gateway-llm listening on :8080, you're ready.
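If you are scripting the setup rather than watching the logs, one rough alternative is to poll the port until it accepts TCP connections. This is a sketch under that assumption only: an open port shows that something is listening on :8080, not that the gateway has finished initializing. `wait_for_port` is a hypothetical helper name.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(0.5)
    return False
```

Calling `wait_for_port("localhost", 8080)` after `docker compose up -d` blocks until the backend port opens or the timeout expires.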
2. Issue a virtual key
The master key from your .env is for admin operations only — your apps should call the gateway with a virtual key that you can rate-limit, scope, and revoke independently.
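Open the admin UI at http://localhost:3000, sign in with the master key, and click Create virtual key. Or do it via the API (export GATEWAY_LLM_MASTER_KEY in your shell first, since Docker Compose reads .env but your shell does not):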
curl -X POST http://localhost:8080/admin/keys \
  -H "Authorization: Bearer $GATEWAY_LLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "local-dev",
    "models": ["gpt-4o-mini", "gpt-4o"],
    "rpm": 60,
    "monthly_budget_usd": 25
  }'
The response includes a gw_virt_... token. That's the only secret your app needs.
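The same admin call can be made from Python with only the standard library. The sketch below mirrors the curl example field for field; `build_create_key_request` is a hypothetical helper, and the shape of the JSON response is not documented on this page, so the code builds the request and leaves sending and parsing as a commented step.

```python
import json
import os
import urllib.request


def build_create_key_request(base_url: str, master_key: str, name: str,
                             models: list[str], rpm: int,
                             monthly_budget_usd: float) -> urllib.request.Request:
    """Build (but do not send) the POST /admin/keys request from the curl example."""
    body = {
        "name": name,
        "models": models,
        "rpm": rpm,
        "monthly_budget_usd": monthly_budget_usd,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/admin/keys",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
    )


req = build_create_key_request(
    "http://localhost:8080",
    os.environ.get("GATEWAY_LLM_MASTER_KEY", ""),
    name="local-dev",
    models=["gpt-4o-mini", "gpt-4o"],
    rpm=60,
    monthly_budget_usd=25,
)
# To send it against a running gateway:
#     with urllib.request.urlopen(req) as resp:
#         print(json.load(resp))  # response fields are not specified here
```

Keeping request construction separate from sending makes it easy to inspect the payload before it touches a live gateway.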
3. Fire a request
Anything that speaks the OpenAI API works as-is — you just point it at the gateway:
curl
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer gw_virt_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "In one sentence: what is an LLM gateway?"}]
  }'
OpenAI Python SDK
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="gw_virt_...",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi!"}],
)
print(resp.choices[0].message.content)
OpenAI TypeScript SDK
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'gw_virt_...',
});

const resp = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hi!' }],
});
console.log(resp.choices[0].message.content);
4. Verify it routed correctly
Every successful request returns three custom response headers you can inspect:
| Header | What it tells you |
|---|---|
| x-gateway-decision | Which deployment was picked (openai/gpt-4o-mini, anthropic/...). |
| x-gateway-route-bucket | Smart-route classifier bucket: simple, medium, complex. |
| x-gateway-cost-usd | Computed cost of this request in USD, based on the gateway's pricing table. |
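To read these headers programmatically, a small helper can pull them out of any response header mapping. This is a sketch only: `gateway_headers` is a hypothetical name, and the sample values (including the cost figure) are made up for illustration.

```python
from typing import Mapping, Optional

# The three headers listed in the table above.
GATEWAY_HEADERS = (
    "x-gateway-decision",
    "x-gateway-route-bucket",
    "x-gateway-cost-usd",
)


def gateway_headers(headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Pick the gateway headers out of a response, matching names case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered.get(name) for name in GATEWAY_HEADERS}


# Illustrative input; real values come from an actual gateway response.
sample = {
    "Content-Type": "application/json",
    "X-Gateway-Decision": "openai/gpt-4o-mini",
    "X-Gateway-Route-Bucket": "simple",
    "X-Gateway-Cost-Usd": "0.000012",
}
for name, value in gateway_headers(sample).items():
    print(f"{name}: {value}")
```

With `urllib.request`, passing `dict(resp.headers.items())` to the helper works. With curl, adding `-i` prints the response headers alongside the body.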
Or hit /metrics for the running aggregates:
curl http://localhost:8080/metrics | grep gatewayllm_
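The grep above can also be done in Python if you want the values as numbers. The sketch below assumes /metrics serves the Prometheus text exposition format without per-sample timestamps; `gateway_metrics` is a hypothetical helper, and the metric name in the sample is invented for illustration rather than taken from the gateway.

```python
def gateway_metrics(exposition_text: str, prefix: str = "gatewayllm_") -> dict[str, float]:
    """Map each sample line whose metric name starts with prefix to its value."""
    samples = {}
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blanks and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        if not line.startswith(prefix):
            continue
        # The value is the last space-separated field (assuming no timestamp).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical payload; real metric names come from the running gateway.
sample = """\
# HELP gatewayllm_requests_total Hypothetical counter, for illustration only.
# TYPE gatewayllm_requests_total counter
gatewayllm_requests_total{provider="openai"} 3
go_goroutines 12
"""
print(gateway_metrics(sample))
```

Unlike the grep, this skips the HELP and TYPE comment lines and keeps only the sample values.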
What's next
- Configuration: config.yaml structure, env vars, secrets.
- Providers: adding more providers and routing across them.
- Smart routing: picking which model a prompt actually goes to.
- Virtual API keys: per-team rate limits and budgets.
- Observability: Prometheus, OpenTelemetry, Datadog, Langfuse.
Stuck or want a feature? Email the founders directly at mitshawtechnologies@gmail.com. We answer fast.