Quickstart

Get Gateway-LLM running locally in under five minutes and fire your first OpenAI-compatible request.

3 min read · updated 2026-04-29

This page takes you from zero to a live gateway answering an OpenAI-compatible request. It assumes you have Docker installed and at least one provider API key.

Prerequisites

Docker 24+ (Docker Desktop on macOS / Windows is fine)
One of: OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY
(Optional) Go 1.23+ if you want to build from source instead of running the published image

1. Pull and run

The fastest path is Docker Compose. Clone the repository and create your .env from the example file:

git clone https://github.com/showmikb/gateway-llm.git
cd gateway-llm
cp .env.example .env

Open .env and set at least the master key plus one provider key:

GATEWAY_LLM_MASTER_KEY=sk-master-pick-anything-secret
OPENAI_API_KEY=sk-...
# Optional:
# ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=AIza...
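Before starting the stack, it can help to confirm that .env contains the master key and at least one provider key. The sketch below is a hypothetical helper, not part of Gateway-LLM. It assumes plain KEY=value lines and only checks that values are non-empty, not that the keys are valid.

```python
from pathlib import Path

# Provider variables listed in the prerequisites above.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    if not env.get("GATEWAY_LLM_MASTER_KEY"):
        problems.append("GATEWAY_LLM_MASTER_KEY is not set")
    if not any(env.get(name) for name in PROVIDER_KEYS):
        problems.append(
            "no provider key set (need one of: " + ", ".join(PROVIDER_KEYS) + ")"
        )
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text() if env_file.exists() else ""
    for problem in missing_settings(parse_env(text)):
        print("problem:", problem)
```

Run it from the repository root; no output means both requirements are met.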

Then bring everything up:

docker compose up -d

This starts:

The gateway backend on :8080
The admin UI on :3000
A local PostgreSQL (for spend logging + virtual keys)
A local Redis (for rate limiting)

Tail the logs to confirm:

docker compose logs -f gateway-llm

When you see gateway-llm listening on :8080, you're ready.
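If you script the setup, watching logs by eye is awkward. A rough alternative is to poll the gateway port until it accepts TCP connections. The helper below is a hypothetical sketch using only the standard library. An open port is a proxy for readiness, not a guarantee that the gateway has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection succeeds, False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("localhost", 8080) after docker compose up -d blocks until the backend port opens or 30 seconds pass.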

2. Issue a virtual key

The master key from your .env is for admin operations only — your apps should call the gateway with a virtual key that you can rate-limit, scope, and revoke independently.

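Open the admin UI at http://localhost:3000, sign in with the master key, and click Create virtual key. Or do it via the API. The command below reads the master key from the GATEWAY_LLM_MASTER_KEY shell variable, so export it first; your shell does not load .env automatically: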

curl -X POST http://localhost:8080/admin/keys \
  -H "Authorization: Bearer $GATEWAY_LLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "local-dev",
    "models": ["gpt-4o-mini", "gpt-4o"],
    "rpm": 60,
    "monthly_budget_usd": 25
  }'

The response includes a gw_virt_... token. That's the only secret your app needs.
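The same call can be made from a script. The sketch below mirrors the curl request above using Python's standard library. The function names are hypothetical, and because the exact shape of the response body is not documented here, it returns the decoded JSON unchanged rather than guessing which field holds the token.

```python
import json
import urllib.request


def build_create_key_request(base_url: str, master_key: str, spec: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /admin/keys request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/admin/keys",
        data=json.dumps(spec).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_virtual_key(base_url: str, master_key: str, spec: dict) -> dict:
    """Send the request and return the decoded JSON body as-is."""
    request = build_create_key_request(base_url, master_key, spec)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, create_virtual_key("http://localhost:8080", master_key, {"name": "local-dev", "models": ["gpt-4o-mini"], "rpm": 60, "monthly_budget_usd": 25}) sends the same kind of payload as the curl command. Inspect the returned dictionary for the gw_virt_... token.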

3. Fire a request

Anything that speaks the OpenAI API works as-is — you just point it at the gateway:

curl

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer gw_virt_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "In one sentence: what is an LLM gateway?"}]
  }'

OpenAI Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="gw_virt_...",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi!"}],
)
print(resp.choices[0].message.content)

OpenAI TypeScript SDK

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'http://localhost:8080/v1',
  apiKey: 'gw_virt_...',
});

const resp = await client.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [{ role: 'user', content: 'Hi!' }],
});
console.log(resp.choices[0].message.content);

4. Verify it routed correctly

Every successful request returns three custom response headers you can inspect:

| Header | What it tells you |
|---|---|
| x-gateway-decision | Which deployment was picked (openai/gpt-4o-mini, anthropic/...). |
| x-gateway-route-bucket | Smart-route classifier bucket: simple, medium, complex. |
| x-gateway-cost-usd | Computed cost of this request, against the gateway's pricing table. |
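To read these headers in code, you need the raw HTTP response rather than just the parsed completion. The helper below is a hypothetical sketch: it takes any header mapping and returns the three gateway headers, matching names case-insensitively and leaving values as strings, since their exact format is not specified here.

```python
from typing import Mapping, Optional

GATEWAY_HEADERS = (
    "x-gateway-decision",
    "x-gateway-route-bucket",
    "x-gateway-cost-usd",
)


def routing_summary(headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Pick the gateway headers out of a response, ignoring header-name case."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # Missing headers map to None so callers can tell "absent" from "empty".
    return {name: lowered.get(name) for name in GATEWAY_HEADERS}
```

With the OpenAI Python SDK, one way to get a header mapping is client.chat.completions.with_raw_response.create(...), whose result exposes .headers. With curl, adding -i prints the headers alongside the body.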

Or hit /metrics for the running aggregates:

curl http://localhost:8080/metrics | grep gatewayllm_
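The same filtering can be done in a script. The sketch below assumes /metrics returns the usual Prometheus text format. The function names are hypothetical, and unlike grep it keeps only sample lines that begin with the prefix, dropping # HELP and # TYPE comments.

```python
import urllib.request


def gateway_metric_lines(exposition: str, prefix: str = "gatewayllm_") -> list[str]:
    """Return sample lines whose metric name starts with the prefix."""
    return [line for line in exposition.splitlines() if line.startswith(prefix)]


def fetch_gateway_metrics(base_url: str = "http://localhost:8080") -> list[str]:
    """Download /metrics and keep only the gateway's own samples."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/metrics", timeout=10) as response:
        return gateway_metric_lines(response.read().decode("utf-8"))
```

Calling fetch_gateway_metrics() against a running gateway returns one string per sample, which you can print or parse further.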

What's next

Configuration — config.yaml structure, env vars, secrets.
Providers — adding more providers and routing across them.
Smart routing — picking which model a prompt actually goes to.
Virtual API keys — per-team rate limits and budgets.
Observability — Prometheus, OpenTelemetry, Datadog, Langfuse.

Stuck or want a feature? Email the founders directly at mitshawtechnologies@gmail.com. We answer fast.