LLM Gateway Docs v2.0

Quickstart

Deploy your first LLM Gateway and route a request to an AI provider in under 5 minutes.

1. Prerequisites

Before you begin, make sure you have the following:

  • Docker 24+ or Node.js 20+, for running the gateway
  • An API key from at least one LLM provider (e.g. OpenAI, Anthropic)
  • curl or any HTTP client, to test your first request
2. Install

The fastest way to run LLM Gateway is via Docker:

```bash
docker pull realtimedetect/llm-gateway:2.0
docker run -d \
  --name rtd-gateway \
  -p 8080:8080 \
  -p 9090:9090 \
  -v $(pwd)/gateway.yaml:/etc/rtd/gateway.yaml \
  -e OPENAI_API_KEY=sk-... \
  realtimedetect/llm-gateway:2.0
```

Alternatively, install via npm:

```bash
npm install -g @realtimedetect/gateway
rtd-gateway start --config gateway.yaml
```

Note

Replace sk-... with your actual OpenAI API key. You can set multiple provider keys as environment variables.
3. Configure

Create a gateway.yaml file in your working directory:

gateway.yaml
```yaml
gateway:
  name: "my-llm-gateway"
  version: "2.0"

listeners:
  - name: "http"
    port: 8080
    protocol: HTTP

routes:
  - name: "chat-completions"
    path: "/v1/chat/completions"
    methods: ["POST"]
    backend: "openai-backend"
    policies:
      - rateLimit: "default"
      - timeout: 30s

backends:
  - name: "openai-backend"
    type: openai
    config:
      apiKey: "${OPENAI_API_KEY}"
      defaultModel: "gpt-4o"

policies:
  rateLimits:
    - name: "default"
      requests: 60
      period: 1m
      key: api_key
```

Tip

The gateway supports hot-reload. Any changes to gateway.yaml are picked up automatically — no restart required.
4. Start the Gateway

Once your config is in place, start (or restart) the container. Verify it's ready:

```bash
curl -s http://localhost:8080/health | jq .
```

Expected response:

```json
{
  "status": "healthy",
  "version": "2.0.0",
  "providers": {
    "openai": "connected"
  },
  "uptime": "12s"
}
```
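If you poll this endpoint from a deploy script, readiness means `status` is `"healthy"` and every configured provider reports `"connected"`. A small sketch; the helper name is ours, not part of the gateway:

```python
def is_ready(health: dict) -> bool:
    """Return True when the gateway is healthy and all providers are connected."""
    return (
        health.get("status") == "healthy"
        and all(state == "connected" for state in health.get("providers", {}).values())
    )


# Sample payload shaped like the /health response above.
sample = {
    "status": "healthy",
    "version": "2.0.0",
    "providers": {"openai": "connected"},
    "uptime": "12s",
}
print(is_ready(sample))  # True
```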
5. Make Your First Request

The gateway is OpenAI-compatible. Send a request the same way you would to OpenAI directly:

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR_RTD_API_KEY>" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "What is an LLM Gateway?" }
    ]
  }'
```
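The same request from Python, using only the standard library. The URL and the RTD key placeholder mirror the curl example; the actual network call is commented out so the snippet can be read (and run) without a gateway listening:

```python
import json
import urllib.request

GATEWAY_URL = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What is an LLM Gateway?"}],
}
req = urllib.request.Request(
    GATEWAY_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_RTD_API_KEY>",
    },
    method="POST",
)
# With the gateway running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the gateway is OpenAI-compatible, any OpenAI client library should also work if you point its base URL at the gateway.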

You should receive a standard OpenAI-format response:

```json
{
  "id": "chatcmpl-RTD123abc",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "An LLM Gateway is a reverse proxy that sits in front of..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 48,
    "total_tokens": 63
  },
  "_rtd": {
    "provider": "openai",
    "latency_ms": 342,
    "cost_usd": 0.000378
  }
}
```

Tip

Notice the _rtd field appended to every response — it contains gateway metadata: the provider used, response latency, and per-request cost.
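For logging or cost tracking you can peel the _rtd block off each response before handing the remainder to OpenAI-compatible client code. A sketch using the sample response above:

```python
# A response shaped like the gateway's output (trimmed for brevity).
response = {
    "id": "chatcmpl-RTD123abc",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
    "usage": {"prompt_tokens": 15, "completion_tokens": 48, "total_tokens": 63},
    "_rtd": {"provider": "openai", "latency_ms": 342, "cost_usd": 0.000378},
}

# Strip gateway metadata; what remains is a standard OpenAI-format response.
meta = response.pop("_rtd", {})
print(meta["provider"], meta["latency_ms"], meta["cost_usd"])
```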

View Metrics

Prometheus-format metrics are available on port 9090:

```bash
curl http://localhost:9090/metrics | grep rtd_
```

```text
rtd_requests_total{provider="openai",model="gpt-4o",status="200"} 1
rtd_latency_ms_p99{provider="openai"} 342
rtd_cost_usd_total{provider="openai"} 0.000378
rtd_tokens_total{type="prompt"} 15
rtd_tokens_total{type="completion"} 48
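Each line follows the Prometheus text exposition format: a metric name, a set of labels in braces, and a value. In practice you would scrape these with Prometheus or a client library, but a minimal parser for lines like the ones above fits in a few lines:

```python
import re

# name{label="value",...} value
LINE_RE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')


def parse_metric(line: str):
    """Parse one Prometheus text-format line into (name, labels, value)."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    labels = dict(part.split("=", 1) for part in m.group("labels").split(",") if part)
    labels = {k: v.strip('"') for k, v in labels.items()}
    return m.group("name"), labels, float(m.group("value"))


name, labels, value = parse_metric(
    'rtd_requests_total{provider="openai",model="gpt-4o",status="200"} 1'
)
# name == "rtd_requests_total", labels["model"] == "gpt-4o", value == 1.0
```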

You're all set!

Your LLM Gateway is running. Here's what to explore next: