Documentation

Providers

Add OpenAI, Anthropic, Google, Bedrock, Ollama, or your own self-hosted models behind a single OpenAI-compatible endpoint.

4 min read · updated 2026-04-29

A provider is anything that speaks an LLM API. Gateway-LLM adapts each provider's request and response shape to the OpenAI Chat Completions / Responses API surface so your application code never has to know which one served the call.

This page covers every supported provider, the env vars it needs, and the gotchas worth knowing before you ship.

Built-in providers

| Provider | provider: value | Auth env var | Streaming | Embeddings | Notes | |---|---|---|---|---|---| | OpenAI | openai | OPENAI_API_KEY | yes | yes | Reference implementation. | | Anthropic | anthropic | ANTHROPIC_API_KEY | yes | no | Translated to/from Messages API. | | Google Gemini | gemini | GEMINI_API_KEY | yes | yes | Free tier rate-limits aggressively. | | AWS Bedrock | bedrock | AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY / AWS_REGION | yes | yes | IAM auth. Pay-per-request. | | Mistral | mistral | MISTRAL_API_KEY | yes | yes | | | Together | together | TOGETHER_API_KEY | yes | yes | OSS model hosting. | | Groq | groq | GROQ_API_KEY | yes | no | Very fast Llama / Mixtral inference. | | Ollama | ollama | (none, set OLLAMA_BASE_URL) | yes | yes | Self-hosted, local-first. |

Adding a provider

Each entry under model_list lists one or more deployments. Each deployment names exactly one provider. To add Anthropic alongside OpenAI:

model_list:
  - model_alias: 'gpt-4o'
    deployments:
      - provider: openai
        model: gpt-4o
        api_key_env: OPENAI_API_KEY

  - model_alias: 'claude-sonnet'
    deployments:
      - provider: anthropic
        model: claude-sonnet-4-20250514
        api_key_env: ANTHROPIC_API_KEY

Set ANTHROPIC_API_KEY in the gateway's environment and reload — the new alias is live.

One alias, many providers (failover + routing)

The pattern that gives you most of Gateway-LLM's value: a single alias backed by deployments across providers.

model_list:
  - model_alias: 'smart'
    deployments:
      - provider: openai
        model: gpt-4o
        api_key_env: OPENAI_API_KEY
      - provider: anthropic
        model: claude-sonnet-4-20250514
        api_key_env: ANTHROPIC_API_KEY
      - provider: gemini
        model: gemini-2.0-flash
        api_key_env: GEMINI_API_KEY

Calls to model: 'smart' now flow through whichever deployment the router strategy picks. If OpenAI 5xxs, retries fall to Anthropic, then Gemini, transparently.

Provider-specific gotchas

OpenAI

  • Streaming uses Server-Sent Events. The gateway re-frames provider chunks into the OpenAI delta format your client expects, so streaming "just works" even when the upstream is Anthropic or Gemini.
  • The Responses API is bridged through Chat Completions for providers that don't support it natively.

Anthropic

  • The Anthropic Messages API doesn't have a separate system role at the message level — it lives at the request level. The gateway moves your messages[0] over for you when its role is system.
  • Tool calls and structured outputs are translated bidirectionally.

Google Gemini

  • The free Generative Language API has aggressive per-minute and per-day quotas. Move to a paid Google Cloud project before you ship anything serious.
  • gemini-2.0-flash is the cheap workhorse; gemini-2.5-pro is the flagship. Pricing is per character, not per token, and the gateway converts internally.

AWS Bedrock

  • Auth is IAM, not an API key. Mount AWS credentials the same way you would for any other AWS service (env vars, instance role, IRSA in EKS).
  • Bedrock model IDs include the region, e.g. anthropic.claude-3-5-sonnet-20241022-v2:0. Drop them in model: verbatim.

Ollama (self-hosted)

The gateway can route to a local Ollama instance — useful for offline dev, on-prem deployment, or a fully private flagship.

- model_alias: 'local-llama'
  deployments:
    - provider: ollama
      model: llama3.3:70b

Set OLLAMA_BASE_URL if Ollama isn't on http://localhost:11434.

Custom / proprietary providers

If you've fine-tuned your own model and serve it behind an OpenAI-compatible endpoint (vLLM, TGI, your custom server), point the gateway at it directly:

- model_alias: 'house-model'
  deployments:
    - provider: openai_compatible
      base_url: https://internal-llm.example.com/v1
      model: house-model-v3
      api_key_env: HOUSE_MODEL_KEY

provider: openai_compatible skips the translation layer — the gateway just forwards.

What's next


Stuck or want a feature? Email the founders directly at mitshawtechnologies@gmail.com. We answer fast.