LLM Gateway Docs v2.0

Reference

REST API endpoints, config schema quick reference, changelog, and support resources.

api config changelog support

REST API

All endpoints are served on PORT (default 8080). Admin endpoints require the RTD_ADMIN_TOKEN bearer token.

Method	Path	Auth	Description
GET	/health	none	Liveness probe — returns 200 when gateway is running
GET	/ready	none	Readiness probe — returns 200 when all backends are healthy
GET	/metrics	none	Prometheus metrics in text exposition format
POST	/v1/chat/completions	key/jwt	OpenAI-compatible chat completions (streaming supported)
POST	/v1/completions	key/jwt	Legacy text completions endpoint
POST	/v1/embeddings	key/jwt	Generate text embeddings
GET	/v1/models	key/jwt	List all available models across backends
GET	/admin/api-keys	admin	List all API keys (paginated)
POST	/admin/api-keys	admin	Create a new API key
DELETE	/admin/api-keys/:id	admin	Revoke an API key by ID
GET	/admin/backends	admin	List backend status and health
POST	/admin/backends/:name/reset	admin	Reset circuit breaker for a backend
GET	/admin/routes	admin	List all routing rules
PUT	/admin/config	admin	Hot-reload gateway config (experimental)

Config Schema Quick Reference

Top-level keys of gateway.yaml:

Key	Type	Description
gateway	object	Global gateway settings: auth, TLS, timeouts, observability
listeners	list	Port/protocol bindings. Each entry has name, port, protocol, tls
backends	list	LLM provider connections (or pools). name, type, config required
routes	list	Request matching rules. name, path, methods, backend required
policies	object	Cross-cutting policies: rateLimit, retry, circuitBreaker, contentPolicy

For the full JSON Schema, run rtd-gateway schema --output json or visit the API reference.

Changelog

v2.0January 2025

NewMulti-provider pool backend with latency and cost strategies
NewContent policy engine with PII detection and custom webhooks
NewOpenTelemetry distributed tracing support
NewOAuth 2.0 / OIDC browser flow and session management
NewToken-based rate limiting (per hour / per day)
ImprovedReduced cold-start overhead by 40% with lazy backend initialisation
ImprovedPrometheus metrics now include per-model cost_usd counter
FixedCircuit breaker state not persisting across gateway restarts
FixedStreaming responses occasionally dropped final chunk

v1.5September 2024

NewGoogle Gemini and Meta Llama provider adapters
NewJWKS caching with automatic key rotation
ImprovedRetry backoff now supports jitter to avoid thundering herd
FixedAzure OpenAI adapter did not forward deployment-level headers

v1.0May 2024

NewInitial release with OpenAI, Anthropic, and Azure OpenAI support
NewAPI key authentication and per-key rate limiting
NewRound-robin and weighted load balancing
NewPrometheus metrics and structured JSON logging

Support

Contact Us

Talk to our team about enterprise plans, MSAs, and custom deployments.

Get in touch →

Pricing Plans

Compare Solo, Team, and Enterprise tiers and start a free trial.

View pricing →

Security

Penetration test results, SOC 2 report, and responsible disclosure.

Security details →

Status Page

Live uptime and incident history for all RTD services.

View status →