LLM Gateway Docs v2.0

Reference

REST API endpoints, config schema quick reference, changelog, and support resources.

REST API

All endpoints are served on PORT (default 8080). Admin endpoints require the RTD_ADMIN_TOKEN bearer token.

MethodPathAuthDescription
GET/healthnoneLiveness probe — returns 200 when gateway is running
GET/readynoneReadiness probe — returns 200 when all backends are healthy
GET/metricsnonePrometheus metrics in text exposition format
POST/v1/chat/completionskey/jwtOpenAI-compatible chat completions (streaming supported)
POST/v1/completionskey/jwtLegacy text completions endpoint
POST/v1/embeddingskey/jwtGenerate text embeddings
GET/v1/modelskey/jwtList all available models across backends
GET/admin/api-keysadminList all API keys (paginated)
POST/admin/api-keysadminCreate a new API key
DELETE/admin/api-keys/:idadminRevoke an API key by ID
GET/admin/backendsadminList backend status and health
POST/admin/backends/:name/resetadminReset circuit breaker for a backend
GET/admin/routesadminList all routing rules
PUT/admin/configadminHot-reload gateway config (experimental)

Config Schema Quick Reference

Top-level keys of gateway.yaml:

KeyTypeDescription
gatewayobjectGlobal gateway settings: auth, TLS, timeouts, observability
listenerslistPort/protocol bindings. Each entry has name, port, protocol, tls
backendslistLLM provider connections (or pools). name, type, config required
routeslistRequest matching rules. name, path, methods, backend required
policiesobjectCross-cutting policies: rateLimit, retry, circuitBreaker, contentPolicy

For the full JSON Schema, run rtd-gateway schema --output json or visit the API reference.

Changelog

v2.0January 2025
  • NewMulti-provider pool backend with latency and cost strategies
  • NewContent policy engine with PII detection and custom webhooks
  • NewOpenTelemetry distributed tracing support
  • NewOAuth 2.0 / OIDC browser flow and session management
  • NewToken-based rate limiting (per hour / per day)
  • ImprovedReduced cold-start overhead by 40% with lazy backend initialisation
  • ImprovedPrometheus metrics now include per-model cost_usd counter
  • FixedCircuit breaker state not persisting across gateway restarts
  • FixedStreaming responses occasionally dropped final chunk
v1.5September 2024
  • NewGoogle Gemini and Meta Llama provider adapters
  • NewJWKS caching with automatic key rotation
  • ImprovedRetry backoff now supports jitter to avoid thundering herd
  • FixedAzure OpenAI adapter did not forward deployment-level headers
v1.0May 2024
  • NewInitial release with OpenAI, Anthropic, and Azure OpenAI support
  • NewAPI key authentication and per-key rate limiting
  • NewRound-robin and weighted load balancing
  • NewPrometheus metrics and structured JSON logging

Support

Contact Us

Talk to our team about enterprise plans, MSAs, and custom deployments.

Get in touch

Pricing Plans

Compare Solo, Team, and Enterprise tiers and start a free trial.

View pricing

Security

Penetration test results, SOC 2 report, and responsible disclosure.

Security details

Status Page

Live uptime and incident history for all RTD services.

View status