LLM Gateway Docs v2.0

Security

Protect your LLM endpoints with API keys, JWT/OIDC authentication, per-client rate limiting, TLS enforcement, and request/response content policies.

API Keys

Issue API keys to your clients through the admin API. The gateway stores only a SHA-256 hash of each key, so the plaintext value is returned once, at creation. Each key carries a set of scopes and an optional expiry.

terminal — create an API key
# Create a key with chat and embeddings scopes, expiring in 90 days
curl -X POST https://your-gateway:8080/admin/api-keys \
  -H "Authorization: Bearer ${RTD_ADMIN_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-app-prod",
    "scopes": ["chat", "embeddings"],
    "expiresIn": "90d"
  }'

# Response
{
  "id":  "key_01j9...",
  "key": "rtd-sk-xxxxxxxxxxxxxxxxxxxxxxxx",   # shown once
  "expiresAt": "2025-04-01T00:00:00Z"
}

Clients pass the key as Authorization: Bearer rtd-sk-... on every request.
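
As a rough client-side illustration, the sketch below attaches the key as a Bearer token using only the Python standard library. The request path /v1/chat/completions and the helper name build_request are assumptions made for the example; substitute whatever route your gateway exposes.

```python
import urllib.request

# Hypothetical route; the gateway's real request path may differ.
GATEWAY_URL = "https://your-gateway:8080/v1/chat/completions"


def build_request(api_key: str, body: bytes) -> urllib.request.Request:
    """Build a POST request that carries the API key as a Bearer token."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Pass the returned object to urllib.request.urlopen to send it. Load the key from an environment variable or secret store rather than hard-coding it.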

gateway.yaml — enable API key auth
gateway:
  auth:
    apiKey:
      enabled: true
      header: "Authorization"   # or X-API-Key
      prefix: "Bearer "         # strip prefix before lookup
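
To make the header and prefix settings concrete, here is a minimal sketch of how a lookup against hashed keys could work: strip the configured prefix, hash the remainder with SHA-256, and compare against stored digests. This is not the gateway's actual implementation; the function names and the in-memory set of hashes are assumptions for illustration.

```python
import hashlib
import hmac
from typing import Optional


def extract_key(headers: dict, header: str = "Authorization",
                prefix: str = "Bearer ") -> Optional[str]:
    """Return the key with the prefix stripped, or None if absent or malformed.

    Uses an exact dict lookup for brevity; real HTTP header names are
    case-insensitive.
    """
    value = headers.get(header)
    if value is None or not value.startswith(prefix):
        return None
    return value[len(prefix):]


def hash_key(key: str) -> str:
    """SHA-256 hex digest of the key, the form kept at rest."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def is_known_key(key: str, stored_hashes: set) -> bool:
    """Compare the digest against stored digests in constant time."""
    digest = hash_key(key)
    return any(hmac.compare_digest(digest, h) for h in stored_hashes)
```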

JWT Authentication

Validate JWTs issued by any OIDC-compatible identity provider. The gateway fetches and caches the JWKS automatically.

gateway.yaml — JWT
gateway:
  auth:
    jwt:
      enabled: true
      jwksUrl: "https://your-idp.example.com/.well-known/jwks.json"
      audience: "https://api.realtimedetect.com"
      issuer:   "https://your-idp.example.com"
      algorithms: ["RS256", "ES256"]
      # Claims to forward as headers to the backend
      forwardClaims:
        sub: "X-User-Id"
        org_id: "X-Org-Id"
      # Cache JWKS for this duration
      jwksCacheDuration: 60m
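
The claim checks and the forwardClaims mapping above can be sketched as follows. This example deliberately omits signature verification against the JWKS, which a real deployment must perform first; it only shows issuer, audience, and expiry checks on already-decoded claims, plus the claim-to-header mapping. The function names are hypothetical.

```python
import time
from typing import Dict, Optional

# Mirrors the forwardClaims block in the config above.
FORWARD_CLAIMS = {"sub": "X-User-Id", "org_id": "X-Org-Id"}


def check_claims(claims: dict, issuer: str, audience: str,
                 now: Optional[float] = None) -> bool:
    """Check iss, aud, and exp on decoded claims (signature NOT verified here)."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # aud may be str or list
    exp = claims.get("exp")
    return (
        claims.get("iss") == issuer
        and audience in audiences
        and isinstance(exp, (int, float))
        and exp > now
    )


def forward_headers(claims: dict,
                    mapping: Dict[str, str] = FORWARD_CLAIMS) -> Dict[str, str]:
    """Map selected claims to backend headers, skipping claims that are absent."""
    return {header: str(claims[claim])
            for claim, header in mapping.items() if claim in claims}
```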

Note: If both API key and JWT auth are enabled, the gateway accepts either one by default. Set requireAll: true under gateway.auth to require both on every request.
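
The either-or versus require-both behavior amounts to the following decision, shown here as a small hedged sketch. The function and parameter names are illustrative, not part of the gateway.

```python
def is_authenticated(api_key_ok: bool, jwt_ok: bool,
                     require_all: bool = False) -> bool:
    """Combine the two auth results according to the requireAll setting."""
    if require_all:
        return api_key_ok and jwt_ok   # both credentials must be valid
    return api_key_ok or jwt_ok        # default: either credential suffices
```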

Rate Limiting

Apply request-count or token-count rate limits per API key, per JWT subject, per IP, or globally. Limits use a sliding-window algorithm, with counters stored in Redis so that all gateway instances share the same state.

gateway.yaml — rate limiting
policies:
  rateLimit:
    # Global fallback limit
    global:
      requests: 1000
      period: 1m
      key: "global"

    # Per API key
    apiKey:
      requests: 200
      period: 1m
      burst: 30              # allow short bursts above the limit
      key: "api_key_id"

    # Per JWT subject (user)
    jwt:
      requests: 60
      period: 1m
      key: "jwt.sub"

    # Per IP (unauthenticated fallback)
    ip:
      requests: 20
      period: 1m
      key: "remote_ip"

    # Token-based limit (counts prompt + completion tokens)
    tokens:
      limit: 100000          # tokens per window
      period: 1h
      key: "api_key_id"

    # Redis backend for distributed rate limiting
    store:
      type: redis
      address: "${REDIS_URL}"
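
For intuition about the sliding-window behavior, here is a minimal in-memory sketch. It keeps a log of request timestamps per key and is a stand-in only: the gateway stores its state in Redis, its exact window variant is not specified here, and the burst setting is not modeled. The class name is hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Sliding-window log limiter: at most `requests` hits per `period_seconds`."""

    def __init__(self, requests: int, period_seconds: float):
        self.requests = requests
        self.period = period_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.period:
            hits.popleft()
        if len(hits) >= self.requests:
            return False  # over the limit; the caller would answer 429
        hits.append(now)
        return True
```

Each rate-limit key (for example an API key ID or a remote IP) gets its own window, so one noisy client does not consume another client's quota.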

HTTP 429 Response

When a client exceeds its limit, the gateway returns:

HTTP/1.1 429 Too Many Requests
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit reached: 200 requests per minute. Retry after 14 seconds.",
    "type": "rate_limit_error",
    "param": null
  }
}

# Response headers
X-RateLimit-Limit:     200
X-RateLimit-Remaining: 0
X-RateLimit-Reset:     1714512034
Retry-After:           14
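
On the client side, the headers above are enough to decide how long to back off. The sketch below prefers Retry-After and falls back to X-RateLimit-Reset, which it assumes is a Unix timestamp in seconds (the sample value looks like one, but that is an inference). The function name and the one-second default are illustrative choices.

```python
import time
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                now: Optional[float] = None) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the status is not 429."""
    if status != 429:
        return None
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after)
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        # Assumption: the reset value is epoch seconds.
        now = time.time() if now is None else now
        return max(0.0, float(reset) - now)
    return 1.0  # conservative default when no hint is present
```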

OAuth 2.0 / OIDC

Enable browser-based OAuth 2.0 flows so end users authenticate directly with your IdP. The gateway acts as the OAuth client and issues session tokens that expire after sessionDuration.

gateway.yaml — OAuth 2.0
gateway:
  auth:
    oauth2:
      enabled: true
      provider: "auth0"           # auth0 | azure-ad | okta | google | custom
      clientId:     "${OAUTH_CLIENT_ID}"
      clientSecret: "${OAUTH_CLIENT_SECRET}"
      redirectUrl:  "https://api.realtimedetect.com/auth/callback"
      scopes: ["openid", "profile", "email"]
      # For custom providers
      # authorizationUrl: "https://..."
      # tokenUrl: "https://..."
      sessionDuration: 24h
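
To show what the settings above feed into, the sketch below builds the redirect URL for a standard OAuth 2.0 authorization code request (response_type, client_id, redirect_uri, scope, and state as defined by the OAuth 2.0 specification). It assumes the gateway uses the authorization code flow, which the redirectUrl callback suggests but the text does not state; the endpoint URL in the usage note is a placeholder.

```python
import secrets
from typing import List, Optional, Tuple
from urllib.parse import urlencode


def authorization_url(authorization_endpoint: str, client_id: str,
                      redirect_url: str, scopes: List[str],
                      state: Optional[str] = None) -> Tuple[str, str]:
    """Return (url, state) for an authorization code request."""
    # A random state value lets the callback detect forged redirects (CSRF).
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_url,
        "scope": " ".join(scopes),  # scopes are space-delimited
        "state": state,
    })
    return f"{authorization_endpoint}?{query}", state
```

For example, authorization_url("https://your-idp.example.com/authorize", client_id, "https://api.realtimedetect.com/auth/callback", ["openid", "profile", "email"]) yields the URL to which the browser is redirected; the callback must check that the returned state matches.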

TLS

Terminate TLS at the gateway using your own certificate, or auto-provision one via Let's Encrypt.

gateway.yaml — TLS
listeners:
  - name: "https"
    port: 443
    protocol: HTTPS
    tls:
      mode: TERMINATE
      certFile: "/etc/rtd/tls/tls.crt"
      keyFile:  "/etc/rtd/tls/tls.key"
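      minVersion: "TLS1.2"
      # Both suites below are TLS 1.3 suites. If this list also governs
      # TLS 1.2 handshakes, add a TLS 1.2 suite or raise minVersion.
      ciphers: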
        - TLS_AES_256_GCM_SHA384
        - TLS_CHACHA20_POLY1305_SHA256

  # HTTP → HTTPS redirect
  - name: "http-redirect"
    port: 80
    protocol: HTTP
    redirect:
      scheme: "https"
      status: 301
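
Clients can mirror the gateway's minimum TLS version on their side. The following sketch uses Python's standard ssl module to build a context that verifies certificates and refuses anything below TLS 1.2; the helper name is illustrative.

```python
import ssl


def client_context() -> ssl.SSLContext:
    """TLS client context with certificate verification and a TLS 1.2 floor."""
    ctx = ssl.create_default_context()            # verifies certs and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # match the gateway's minVersion
    return ctx
```

Pass the context to urllib.request.urlopen(url, context=client_context()) or to any other client that accepts an SSLContext.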

Content Policy

Intercept and filter request prompts and/or model responses using built-in classifiers or a custom webhook.

gateway.yaml — content policy
policies:
  contentPolicy:
    enabled: true
    # Block prompts containing PII (built-in classifier)
    piiDetection:
      enabled: true
      block: true            # false = mask only
      entities: ["EMAIL", "PHONE", "SSN", "CREDIT_CARD"]

    # Block harmful categories
    moderation:
      enabled: true
      categories:
        - hate
        - violence
        - self-harm

    # Custom webhook — return 200 to allow, 4xx to block
    webhook:
      url: "${CONTENT_POLICY_WEBHOOK_URL}"
      timeout: 2s
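
A custom webhook only has to answer with 200 to allow or a 4xx status to block. The sketch below is one possible shape, using Python's standard http.server. The request payload format is not documented in this section, so the JSON body with a "text" field is purely an assumption, as are the port, the names, and the toy rule that blocks anything resembling an email address.

```python
import json
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy rule for illustration: treat anything that looks like an email as a violation.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def decide(payload: dict) -> int:
    """Return 200 to allow or 403 to block. Assumes a hypothetical 'text' field."""
    text = str(payload.get("text", ""))
    return 403 if EMAIL_RE.search(text) else 200


class PolicyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        # A malformed body gets 400, which the gateway treats as a block.
        status = decide(payload) if isinstance(payload, dict) else 400
        self.send_response(status)
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 9000), PolicyHandler).serve_forever()
```

Keep the handler fast: with timeout: 2s configured, a slow webhook will hit the gateway's deadline.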