LLM Router is a blazing-fast, developer-first AI Gateway. It sits between your application and AI providers (like OpenAI, Anthropic, and Google) to dynamically optimize every request for cost, latency, capabilities, and security. Stop paying Opus-4.6 prices for simple tasks, and stop leaking sensitive PII to third-party models. LLM Router gives you granular control over your AI infrastructure with zero code rewrites.
Zero Friction: LLM Router is a 100% drop-in replacement for the OpenAI API. Just change your baseURL and apiKey and you are instantly optimized.

Why LLM Router?

Building AI applications is easy. Scaling them efficiently is hard. We built LLM Router to solve the biggest headaches in AI engineering:
  1. Skyrocketing Token Costs: Developers waste millions of tokens sending bloated chat histories, unused tools, and heavy images to expensive models.
  2. Model Lock-in & Downtime: When OpenAI goes down, your app goes down.
  3. Data Privacy: Users accidentally paste passwords, API keys, and PII (e.g., credit card numbers) into chat prompts.
  4. Lack of Control: Most gateways give you no way to fine-tune exactly how and where each request is handled.

Routing Configuration: Dashboard vs Per-Request

You have full flexibility when configuring routing behavior:
  • Via the LLM Router Dashboard (Recommended for most cases)
    Configure default routing rules, tags, models, and preferences directly on your API keys. This includes assigning tags (like coding, ui design, testing), enabling Zero Data Retention (ZDR), context optimization settings, and more. These settings apply automatically to all requests using that API key.
  • Via the Request Payload (Per-request override)
    You can also pass routing configuration directly in every API call using the gateway object. This gives you maximum flexibility for dynamic behavior.
Important: Any configuration sent in the request overrides the settings defined on the API key in the dashboard. This allows you to have safe defaults while still customizing behavior for specific workflows or users.

Core Capabilities

Intelligent Tag Routing

Assign tags to API keys or pass them per request. We combine your business rules with real-time prompt analysis to route to the best model automatically.

Aggressive Context Pruning

Reduce input costs by up to 80%. Automatically drop irrelevant history, strip unused tools, and remove unnecessary media.

Plug-and-Play Skills

Install Skills from the catalog or your own GitHub repos. Dynamically inject targeted instructions only when relevant.
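As a purely hypothetical sketch of what attaching Skills per request could look like (the `skills`, `source`, and `repo` field names below are placeholders, not a confirmed request schema):

```typescript
// Hypothetical shape only: field names are illustrative placeholders.
const gateway = {
  skills: [
    { source: "catalog", name: "sql-review" },        // installed from the catalog
    { source: "github", repo: "acme/router-skills" }, // loaded from your own repo
  ],
};
```

Because Skills are injected only when relevant, listing several here does not mean every prompt pays their token cost.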

Zero Data Retention (ZDR)

Enforce strict privacy by routing only to providers with Zero Data Retention guarantees — per key or per request.

How It Works

LLM Router acts as an intelligent proxy. When a request comes in from your app, our internal engine analyzes the prompt. Depending on your configured rules and tags, it will:
  1. Redact sensitive data.
  2. Prune bloated context.
  3. Attach requested Skills to the prompt.
  4. Score the complexity of the request.
  5. Route the optimized prompt to the most cost-effective upstream provider.
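The five steps above amount to a pipeline of transforms over the request. The sketch below is conceptual (the function bodies are toy stand-ins, not the real engine), but it shows the shape of the flow:

```typescript
// Conceptual pipeline: each stage takes the request context and
// returns an updated one. Bodies are illustrative stand-ins.
type Ctx = { prompt: string; complexity?: number; provider?: string };

// 1. Redact sensitive data (here: anything that looks like an API key).
const redact = (c: Ctx): Ctx => ({
  ...c,
  prompt: c.prompt.replace(/sk_[A-Za-z0-9_]+/g, "[REDACTED]"),
});
// 2. Prune bloated context (here: just trim whitespace).
const prune = (c: Ctx): Ctx => ({ ...c, prompt: c.prompt.trim() });
// 4. Score request complexity (here: a crude length heuristic).
const score = (c: Ctx): Ctx => ({
  ...c,
  complexity: c.prompt.length > 200 ? 1 : 0,
});
// 5. Route to an upstream tier based on the score.
const route = (c: Ctx): Ctx => ({
  ...c,
  provider: c.complexity ? "frontier-model" : "budget-model",
});

const handle = (c: Ctx): Ctx =>
  [redact, prune, score, route].reduce((acc, f) => f(acc), c);
```

Skill attachment (step 3) is omitted here for brevity; it slots into the same chain between pruning and scoring.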

The “Aha!” Moment

Integrating LLM Router doesn’t require learning a new SDK. You just change one line of code in your existing app.
import OpenAI from "openai";

// 1. Point the official OpenAI client to LLM Router
const client = new OpenAI({
  baseURL: "https://api.llmrouter.app/v1", // <-- Just change this
  apiKey: "sk_llmr_...", // <-- And use your Router Key
});

// 2. The router handles complexity analysis and fallback automatically
const response = await client.chat.completions.create({
  messages: [{ role: "user", content: "Explain quantum physics." }],
  // @ts-expect-error -- `gateway` is a router extension, not part of the OpenAI SDK types
  gateway: {
    tags: [
      {
        name: "coding",
        description:
          "Code generation, refactoring, debugging, and software development",
        models: [
          "anthropic/claude-opus-4.6",
          "openai/gpt-5.3-codex",
          "deepseek/deepseek-v3.2",
        ],
      },
      {
        name: "ui design",
        description:
          "UI/UX design, component generation, Tailwind, Figma-like descriptions, and frontend aesthetics",
        models: [
          "google/gemini-3.1-pro",
          "openai/gpt-5.3-codex",
          "xai/grok-4.20",
        ],
      },
      {
        name: "testing",
        description:
          "Writing unit tests, integration tests, test-driven development, and QA",
        models: [
          "openai/gpt-5.4",
          "deepseek/deepseek-v3.2",
          "mistral/mistral-large-3",
        ],
      },
    ],
    imageGenerationModel: "google/gemini-3-pro-image", // Used when the request involves creating images.
  },
});

Billing & Responsibility

You are responsible for Stripe’s payment processing fees and for any usage charged by upstream AI providers (OpenAI, Anthropic, Google, etc.): you pay the underlying model providers directly for token usage.