How is an LLM gateway different from calling providers directly?

Direct calls tie each app to one provider's SDK, spread raw API keys across environments, and leave cost and security controls to each team. A gateway centralizes keys, makes providers swappable behind one OpenAI-compatible API, and enforces spend and security policies in one place, before the call executes.

What makes an LLM gateway a governance platform?

Routing alone is a proxy. Governance adds who-can-do-what: access grants per team or key, budget reservation before execution, guardrails on prompts and outputs, and audit-ready records of every model decision. That combination is what compliance programs built on frameworks like the EU AI Act typically expect.

Does an LLM gateway add latency?

A well-built gateway adds milliseconds of overhead, most of it from policy checks that run in memory. That overhead is usually invisible next to model inference time, and it buys failover: when a provider degrades, the gateway reroutes instead of failing your feature.

Can I self-host an LLM gateway?

Yes. Odock is open source and self-hostable, so the gateway, its policies, and its usage records run inside your infrastructure. That matters for data residency, key custody, and compliance reviews where an external control plane is not acceptable.

odock.ai

Governed model request

LLM Gateway

Govern every model call your teams make

What is an LLM gateway?

An LLM gateway is a single controlled endpoint between your applications and model providers. It authenticates callers with its own keys, applies access policies and guardrails, enforces budgets, routes requests to approved providers, and records usage, so provider access stops being scattered across teams and codebases.

Providers multiply, keys leak, and spend surprises arrive at the end of the month. Odock puts a governed LLM gateway between your apps and every model provider, enforcing access, guardrails, budgets, and audit records before any completion is generated.

Book a demo Read the LLM gateway guide

One governed path for every model call.

llm-gateway.request

POST /v1/chat/completions (app runtime)

01 Caller auth: verified
02 Model grants ...
03 Guardrail scan ...
04 Budget hold ...
05 Route provider ...
06 Usage logged ...

Providers

OpenAI, Anthropic, Azure, self-hosted

Controls

Keys, budgets, guardrails, audit

What is an LLM gateway?

One controlled endpoint between your apps and every model provider

An LLM gateway is a control plane between your applications and model providers like OpenAI, Anthropic, Azure, or self-hosted models. Instead of every team holding raw provider keys and calling APIs directly, model traffic flows through one endpoint that authenticates the caller, applies guardrails, reserves budget, routes to an approved provider, and records the outcome. It becomes a governance platform when it goes beyond routing: access grants per team, budgets held before the call fires, prompt injection and data leakage scanning, and audit-ready usage records. That is what Odock is built for, and the same plane also governs agent tool traffic through the MCP gateway.

Where teams use it

Multi-provider access

Give every team one OpenAI-compatible endpoint and virtual API keys. Add, swap, or self-host providers behind the gateway with zero changes to application code.

Cost control that blocks, not reports

Set budgets and quotas per key, team, or project. Odock reserves spend before the call executes, so a runaway agent stops at the limit instead of at the invoice.
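The reserve-before-execute idea can be sketched as a small ledger. This is an illustrative sketch only, not Odock's implementation: the `BudgetLedger` class, its method names, and the dollar amounts are all assumptions made for the example.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a reservation would push a key past its limit."""


@dataclass
class BudgetLedger:
    # Hypothetical ledger for one virtual key; amounts are in USD.
    limit: float
    spent: float = 0.0
    held: float = 0.0

    def reserve(self, estimate: float) -> float:
        # Refuse the call up front if spent + held + estimate exceeds the limit.
        if self.spent + self.held + estimate > self.limit:
            raise BudgetExceeded(f"limit {self.limit:.2f} would be exceeded")
        self.held += estimate
        return estimate

    def settle(self, reserved: float, actual: float) -> None:
        # Release the hold and book the real cost once the provider responds.
        self.held -= reserved
        self.spent += actual


ledger = BudgetLedger(limit=10.0, spent=9.5)
hold = ledger.reserve(0.25)      # fits under the limit, so the call may proceed
ledger.settle(hold, actual=0.2)  # the real cost came in below the estimate

try:
    ledger.reserve(0.5)          # would exceed the limit, so the call never fires
    blocked = False
except BudgetExceeded:
    blocked = True

print(blocked)
```

Holding the estimate before the call, then settling to the actual cost afterward, is what lets concurrent requests from the same key see each other's pending spend.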

Security and compliance at the gateway

Scan prompts for injection and data leakage, mask sensitive fields, enforce provider data policies, and produce audit records for every model decision.
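As a rough illustration of masking sensitive fields before a prompt leaves the gateway, the sketch below redacts two kinds of values with regular expressions. The patterns, the placeholder format, and the `mask_prompt` function are assumptions for the example; a real guardrail would cover many more cases and is not limited to regex matching.

```python
import re

# Illustrative patterns only; a production guardrail would cover far more cases.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def mask_prompt(text: str) -> tuple[str, list[str]]:
    """Replace matches with a placeholder and report which kinds were found."""
    findings = []
    for kind, pattern in PATTERNS.items():
        text, count = pattern.subn(f"[{kind.upper()}_REDACTED]", text)
        if count:
            findings.append(kind)
    return text, findings


masked, found = mask_prompt("Refund jane.doe@example.com on card 4111 1111 1111 1111.")
print(masked)
print(found)
```

Returning the list of findings alongside the masked text gives the audit record something concrete to store without keeping the sensitive value itself.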

LLM request lifecycle

Every model call follows a governed path

No request reaches a provider until it passes auth, access, inspection, and cost controls. Every outcome is recorded with tokens, latency, and cost.

Authenticate

Validate the virtual API key.

Authorize

Confirm model and provider grants.

Inspect & enforce

Run guardrails on prompt and context.

Reserve spend

Check budgets and quotas before execution.

Route

Send to an approved provider with failover.

Record outcome

Log tokens, latency, status, and cost.
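The governed path above can be pictured as an ordered chain of checks that short-circuits on the first failure. The sketch below is a simplified model under stated assumptions: the check functions, the in-memory policy tables, and the denial reason strings other than `budget_exceeded` are hypothetical, and routing with failover is reduced to a single `call_provider` callback.

```python
from typing import Callable, Optional

Request = dict
Check = Callable[[Request], Optional[str]]  # returns a denial reason, or None to pass

# Hypothetical in-memory policy data standing in for gateway configuration.
VALID_KEYS = {"vk_team_marketing"}
GRANTS = {"vk_team_marketing": {"gpt-5.2"}}
REMAINING_BUDGET = {"vk_team_marketing": 0.0}


def authenticate(req: Request) -> Optional[str]:
    return None if req["apiKey"] in VALID_KEYS else "invalid_key"


def authorize(req: Request) -> Optional[str]:
    granted = GRANTS.get(req["apiKey"], set())
    return None if req["model"] in granted else "model_not_granted"


def inspect(req: Request) -> Optional[str]:
    suspicious = "ignore previous instructions" in req["prompt"].lower()
    return "guardrail_violation" if suspicious else None


def reserve_spend(req: Request) -> Optional[str]:
    return None if REMAINING_BUDGET.get(req["apiKey"], 0.0) > 0 else "budget_exceeded"


PIPELINE: list[Check] = [authenticate, authorize, inspect, reserve_spend]
audit_log: list[dict] = []


def handle(req: Request, call_provider: Callable[[Request], str]) -> dict:
    """Run every check in order; the provider is only called if all of them pass."""
    for check in PIPELINE:
        reason = check(req)
        if reason is not None:
            outcome = {"status": "denied", "reason": reason}
            break
    else:
        outcome = {"status": "ok", "output": call_provider(req)}
    # Denied and successful requests are both recorded.
    audit_log.append({"apiKey": req["apiKey"], "model": req["model"], **outcome})
    return outcome


result = handle(
    {"apiKey": "vk_team_marketing", "model": "gpt-5.2", "prompt": "Write a tagline."},
    call_provider=lambda req: "stubbed completion",
)
print(result)
```

Because the provider callback sits in the `else` branch of the loop, a request with an exhausted budget is logged as denied without any upstream call being made.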

Blocked request example: denied by governance

{
  "apiKey": "vk_team_marketing",
  "model": "gpt-5.2",
  "method": "chat.completions",
  "reason": "budget_exceeded",
  "status": 402
}
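On the calling side, a denial like this is worth treating differently from a transient failure. The sketch below assumes the payload shape shown in the example; the `next_step` function and the actions it returns are hypothetical and only illustrate one way a client might react.

```python
import json

# The payload mirrors the example above; how a client reacts is an assumption.
denial = json.loads("""
{
  "apiKey": "vk_team_marketing",
  "model": "gpt-5.2",
  "method": "chat.completions",
  "reason": "budget_exceeded",
  "status": 402
}
""")


def next_step(payload: dict) -> str:
    """Decide what the caller should do with a gateway error payload."""
    if payload["status"] == 402 and payload["reason"] == "budget_exceeded":
        # Retrying will not help until the budget is raised or resets.
        return "stop_and_alert_owner"
    if payload["status"] >= 500:
        # Server-side failures are the usual candidates for a retry.
        return "retry_with_backoff"
    return "surface_error"


print(next_step(denial))
```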

Why it matters

Model calls are spend, data, and compliance events

Every completion moves money, may carry sensitive data, and may need to be explained to an auditor later. An LLM gateway gives platform, security, and finance teams one enforcement point for all three.

Control which models teams use

Allowlist approved providers and models per team or key, block deprecated or non-compliant models, and roll out new models without touching application code.

Enforce policy before execution

Authenticate every caller, scan prompts for injection and data leakage, and reject requests that fail security, budget, or compliance rules before any provider call is made.

Attribute every token

Tie every request to a key, team, and user with tokens, latency, and cost. Give finance chargeback data and give auditors the records they ask for.
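To make the chargeback idea concrete, the sketch below rolls per-request usage records up into per-team totals. The record field names and the sample values are assumptions modeled on the attributes listed above (key, team, user, tokens, latency, cost); they are not an actual Odock export format.

```python
from collections import defaultdict

# Hypothetical usage records; field names and values are illustrative.
records = [
    {"key": "vk_team_marketing", "team": "marketing", "user": "ana",
     "tokens": 1200, "latency_ms": 840, "cost_usd": 0.018},
    {"key": "vk_team_marketing", "team": "marketing", "user": "raj",
     "tokens": 300, "latency_ms": 410, "cost_usd": 0.004},
    {"key": "vk_team_research", "team": "research", "user": "li",
     "tokens": 5000, "latency_ms": 2300, "cost_usd": 0.075},
]


def chargeback(rows: list[dict]) -> dict[str, dict]:
    """Sum tokens and cost per team for a finance report."""
    totals: dict[str, dict] = defaultdict(lambda: {"tokens": 0, "cost_usd": 0.0})
    for row in rows:
        totals[row["team"]]["tokens"] += row["tokens"]
        totals[row["team"]]["cost_usd"] += row["cost_usd"]
    return dict(totals)


report = chargeback(records)
for team, total in report.items():
    print(team, total["tokens"], round(total["cost_usd"], 4))
```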

What you configure

A complete LLM control surface, not just a proxy

Odock covers everything platform teams need to run LLM traffic in production: provider registration, model access grants, guardrails, pricing, budgets, and usage records your finance and security teams can actually read.

Register providers once and expose approved models through one OpenAI-compatible endpoint. Review API type, auth config, scope, and enabled status before any team can call them.

providers-&-models.llm (Live)

openai-prod (Provider catalog): API OPENAI_API, Auth BEARER, Scope org-wide, Access 24 keys granted
anthropic-prod (Provider catalog): API ANTHROPIC_API, Auth BEARER, Scope org-wide, Access 18 keys granted
azure-openai-eu (Manual setup): API OPENAI_API, Auth BEARER, Scope team:eu-products, Access 6 keys granted
bedrock-us (Provider catalog): API BEDROCK_API, Auth OAUTH2, Scope team:platform, Access 4 keys granted
mistral-eu (Provider catalog): API OPENAI_API, Auth BEARER, Scope team:eu-products, Access 5 keys granted
vllm-selfhosted (Manual setup): API OPENAI_API, Auth NONE, Scope team:research, Access 3 keys granted

One endpoint, two ways in

Your apps call Odock, not the provider

Reach Odock through one unified OpenAI-style endpoint across every provider, or through each provider's native endpoint and SDK when you need provider-specific features. Either way, Odock authenticates the caller, confirms model access, runs guardrails, reserves budget, injects provider credentials, and records the outcome before the provider sees the request.

What the gateway handles

Unified OpenAI-style API across every provider, so existing SDKs work unchanged.
Native provider endpoints and SDKs when you need provider-specific features.
Virtual API key required before any model is reachable.
Provider credentials injected after governance checks pass.
Routing and failover across approved providers.
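Routing with failover, the last item in the list above, can be sketched as trying approved providers in priority order. This is a minimal sketch under stated assumptions: `route_with_failover`, `ProviderError`, and the stub provider functions are hypothetical, and the provider names are borrowed from the catalog shown earlier purely as labels.

```python
from typing import Callable


class ProviderError(Exception):
    """Stand-in for a timeout or server error from an upstream provider."""


def route_with_failover(
    prompt: str,
    providers: list[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try approved providers in priority order and return (provider_name, output)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def degraded(prompt: str) -> str:
    raise ProviderError("upstream timeout")


def healthy(prompt: str) -> str:
    return f"echo: {prompt}"


used, output = route_with_failover(
    "ping",
    [("openai-prod", degraded), ("anthropic-prod", healthy)],
)
print(used, output)
```

Only providers already on the approved list are candidates, so failover never widens what a caller is allowed to reach.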

llm-unified-openai.py

# Use Odock's unified endpoint through an OpenAI-compatible SDK
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["ODOCK_API_KEY"],
    base_url=os.environ.get("ODOCK_BASE_URL", "https://api.odock.ai/v1"),
)

response = client.chat.completions.create(
    model=os.environ.get("ODOCK_MODEL", "claude-sonnet-4-5"),
    messages=[
        {"role": "user", "content": "Explain budget enforcement."}
    ],
    temperature=0.2,
    max_tokens=200,
)

print(response.choices[0].message.content)

EU AI Act & compliance readiness

Evidence for every model decision

Compliance programs need answers to specific questions: which team used which model, under what policy, with what safeguards? Direct provider access can't answer those questions. Odock can.

Route every provider through one governed, auditable OpenAI-compatible endpoint
Enforce prompt injection and data leakage controls outside application code
Produce evidence for internal AI reviews, vendor assessments, and EU AI Act compliance programs
Keep LLM traffic and MCP tool traffic inside the same policy and operational boundary

FAQ

Questions teams ask about LLM gateways

These are the recurring questions that come up when teams move from direct provider access to a governed gateway layer.

One endpoint for every model, with governance built in

If your teams call LLMs in production, you need the same control you expect for any critical dependency. Odock governs model and tool traffic from a single request path.
