Documentation
SDKs
First-class Python, TypeScript, Go, and VS Code SDKs — plus drop-in compatibility with every existing OpenAI client.
3 min read · updated 2026-04-29
The cheapest "SDK" is the one you already use. Gateway-LLM speaks the OpenAI API natively, so the official OpenAI SDKs work as-is — just point them at the gateway:
from openai import OpenAI
client = OpenAI(
base_url="https://gateway.example.com/v1",
api_key="gw_virt_...",
)
That gets you 90% of what you need. The native SDKs below add ergonomics on top: typed routing decisions, spend records, virtual key management, streaming helpers.
Python — gateway-llm
Install:
pip install gateway-llm
Use:
from gateway_llm import GatewayClient
client = GatewayClient(
base_url="https://gateway.example.com",
api_key="gw_virt_...",
)
# Chat completions — same shape as openai-python
resp = client.chat.completions.create(
model="auto",
messages=[{"role": "user", "content": "Summarise this email in two lines"}],
)
print(resp.choices[0].message.content)
# Inspect the routing decision
print(resp.gateway.deployment) # 'openai/gpt-4o-mini'
print(resp.gateway.bucket) # 'simple'
print(resp.gateway.cost_usd) # 0.000099
The Python SDK also exposes a LangChain integration:
from gateway_llm.langchain import ChatGatewayLLM
llm = ChatGatewayLLM(model="auto", api_key="gw_virt_...")
TypeScript / JavaScript — @gateway-llm/sdk
Install:
npm install @gateway-llm/sdk
Use:
import { GatewayClient } from '@gateway-llm/sdk';
const client = new GatewayClient({
baseUrl: 'https://gateway.example.com',
apiKey: 'gw_virt_...',
});
const resp = await client.chat.completions.create({
model: 'auto',
messages: [{ role: 'user', content: 'Hi!' }],
});
// Typed gateway metadata
const { deployment, bucket, costUsd } = resp.gateway;
Streaming uses async iterators (works in Node 18+, Bun, Deno, and modern browsers):
const stream = await client.chat.completions.stream({
model: 'auto',
messages: [{ role: 'user', content: 'Tell me a story' }],
});
for await (const chunk of stream) {
process.stdout.write(chunk.delta);
}
Go — github.com/showmikb/gateway-llm/sdk/go
Install:
go get github.com/showmikb/gateway-llm/sdk/go@latest
Use:
package main
import (
"context"
"fmt"
gateway "github.com/showmikb/gateway-llm/sdk/go"
)
func main() {
client := gateway.New(gateway.Config{
BaseURL: "https://gateway.example.com",
APIKey: "gw_virt_...",
})
resp, err := client.Chat.Completions.Create(context.Background(), &gateway.ChatCompletionRequest{
Model: "auto",
Messages: []gateway.Message{{Role: "user", Content: "Hi!"}},
})
if err != nil { panic(err) }
fmt.Println(resp.Choices[0].Message.Content)
fmt.Println("decision:", resp.Gateway.Deployment)
}
Idiomatic context support, typed errors, and a streaming Recv() interface that doesn't allocate per chunk.
VS Code extension
The Gateway-LLM VS Code extension turns any IDE into a routing-aware playground. Install Gateway-LLM from the Marketplace, run Cmd+Shift+P → Gateway-LLM: Configure, drop in your gateway base URL and a virtual key, and now:
- Hover over any string literal that looks like an LLM prompt to see the cost & bucket the classifier would pick.
Cmd+Kruns the highlighted prompt against your gateway and shows the result inline.- A status-bar widget displays your team's monthly spend in real time.
It's a thin client over the same HTTP API — works against self-hosted and hosted gateways identically.
Curl / direct HTTP
When you want zero dependencies:
curl https://gateway.example.com/v1/chat/completions \
-H 'Authorization: Bearer gw_virt_...' \
-H 'Content-Type: application/json' \
-d '{
"model": "auto",
"messages": [{"role":"user","content":"Hi!"}]
}'
The response includes routing headers (x-gateway-decision, x-gateway-route-bucket, x-gateway-cost-usd) so you can route programmatically without parsing the body.
What's next
- Quickstart — get the server running first if you haven't.
- Virtual API keys — issue the key the SDK is going to use.
- Smart routing — what
model: 'auto'actually does.
Stuck or want a feature? Email the founders directly at mitshawtechnologies@gmail.com. We answer fast.