Documentation
Providers
Add OpenAI, Anthropic, Google, Bedrock, Ollama, or your own self-hosted models behind a single OpenAI-compatible endpoint.
4 min read · updated 2026-04-29
A provider is anything that speaks an LLM API. Gateway-LLM adapts each provider's request and response shape to the OpenAI Chat Completions / Responses API surface so your application code never has to know which one served the call.
This page covers every supported provider, the env vars it needs, and the gotchas worth knowing before you ship.
Built-in providers
| Provider | provider: value | Auth env var | Streaming | Embeddings | Notes |
|---|---|---|---|---|---|
| OpenAI | openai | OPENAI_API_KEY | yes | yes | Reference implementation. |
| Anthropic | anthropic | ANTHROPIC_API_KEY | yes | no | Translated to/from Messages API. |
| Google Gemini | gemini | GEMINI_API_KEY | yes | yes | Free tier rate-limits aggressively. |
| AWS Bedrock | bedrock | AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION | yes | yes | IAM auth. Pay-per-request. |
| Mistral | mistral | MISTRAL_API_KEY | yes | yes | |
| Together | together | TOGETHER_API_KEY | yes | yes | OSS model hosting. |
| Groq | groq | GROQ_API_KEY | yes | no | Very fast Llama / Mixtral inference. |
| Ollama | ollama | (none, set OLLAMA_BASE_URL) | yes | yes | Self-hosted, local-first. |
Adding a provider
Each entry under model_list lists one or more deployments. Each deployment names exactly one provider. To add Anthropic alongside OpenAI:
model_list:
- model_alias: 'gpt-4o'
deployments:
- provider: openai
model: gpt-4o
api_key_env: OPENAI_API_KEY
- model_alias: 'claude-sonnet'
deployments:
- provider: anthropic
model: claude-sonnet-4-20250514
api_key_env: ANTHROPIC_API_KEY
Set ANTHROPIC_API_KEY in the gateway's environment and reload — the new alias is live.
One alias, many providers (failover + routing)
The pattern that gives you most of Gateway-LLM's value: a single alias backed by deployments across providers.
model_list:
- model_alias: 'smart'
deployments:
- provider: openai
model: gpt-4o
api_key_env: OPENAI_API_KEY
- provider: anthropic
model: claude-sonnet-4-20250514
api_key_env: ANTHROPIC_API_KEY
- provider: gemini
model: gemini-2.0-flash
api_key_env: GEMINI_API_KEY
Calls to model: 'smart' now flow through whichever deployment the router strategy picks. If OpenAI 5xxs, retries fall to Anthropic, then Gemini, transparently.
Provider-specific gotchas
OpenAI
- Streaming uses Server-Sent Events. The gateway re-frames provider chunks into the OpenAI delta format your client expects, so streaming "just works" even when the upstream is Anthropic or Gemini.
- The Responses API is bridged through Chat Completions for providers that don't support it natively.
Anthropic
- The Anthropic Messages API doesn't have a separate
systemrole at the message level — it lives at the request level. The gateway moves yourmessages[0]over for you when its role issystem. - Tool calls and structured outputs are translated bidirectionally.
Google Gemini
- The free Generative Language API has aggressive per-minute and per-day quotas. Move to a paid Google Cloud project before you ship anything serious.
gemini-2.0-flashis the cheap workhorse;gemini-2.5-prois the flagship. Pricing is per character, not per token, and the gateway converts internally.
AWS Bedrock
- Auth is IAM, not an API key. Mount AWS credentials the same way you would for any other AWS service (env vars, instance role, IRSA in EKS).
- Bedrock model IDs include the region, e.g.
anthropic.claude-3-5-sonnet-20241022-v2:0. Drop them inmodel:verbatim.
Ollama (self-hosted)
The gateway can route to a local Ollama instance — useful for offline dev, on-prem deployment, or a fully private flagship.
- model_alias: 'local-llama'
deployments:
- provider: ollama
model: llama3.3:70b
Set OLLAMA_BASE_URL if Ollama isn't on http://localhost:11434.
Custom / proprietary providers
If you've fine-tuned your own model and serve it behind an OpenAI-compatible endpoint (vLLM, TGI, your custom server), point the gateway at it directly:
- model_alias: 'house-model'
deployments:
- provider: openai_compatible
base_url: https://internal-llm.example.com/v1
model: house-model-v3
api_key_env: HOUSE_MODEL_KEY
provider: openai_compatible skips the translation layer — the gateway just forwards.
What's next
- Smart routing — once you have multiple providers, decide which prompts go where.
- Virtual API keys — restrict which providers a given team's key can reach.
- Multi-provider failover patterns — production playbook with config snippets.
Stuck or want a feature? Email the founders directly at mitshawtechnologies@gmail.com. We answer fast.