Large Language Models often fail at complex reasoning tasks (like intricate math, deep code refactoring, or multi-step logic puzzles) because they try to generate the final answer immediately. LLM Router solves this with Advanced Planning Routing. By chaining two models together, you can force a “Planner Model” to generate a strict, step-by-step strategy before an “Execution Model” writes the final response.

How Planning Works

When a request arrives, LLM Router analyzes the prompt and generates a Complexity Score (from 0.0 to 1.0). If this score exceeds your configured planningTriggerScore, LLM Router intercepts the request and performs a two-step process:
  1. The Planning Phase: It sends the prompt and conversation history to your designated Planner Model. This model is instructed via a strict System Prompt to only generate a logical execution plan, not the final answer.
  2. The Execution Phase: The resulting plan is injected into the system prompt of your designated Execution Model, which then generates the final output the user requested, guided by the plan.

Defining the Model Chain

You define this Planner/Executor relationship with a specific syntax in your model request or in your gateway.tags configuration:

executorModel:planning:plannerModel

For example, if you want claude-3-5-sonnet to execute the code but o1-mini (a specialized reasoning model) to plan it:

anthropic/claude-3-5-sonnet:planning:openai/o1-mini
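If you build these identifiers dynamically, a tiny helper keeps the executor and planner in the right order. This function is a hypothetical convenience, not part of any LLM Router SDK:

```typescript
// Compose an "executor:planning:planner" chain identifier.
// Argument order matters: the executor comes first, the planner last.
function planningChain(executorModel: string, plannerModel: string): string {
  return `${executorModel}:planning:${plannerModel}`;
}

// planningChain("anthropic/claude-3-5-sonnet", "openai/o1-mini")
//   → "anthropic/claude-3-5-sonnet:planning:openai/o1-mini"
```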

Configuration

You set the complexity threshold that triggers this behavior inside the gateway object.
TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.llmrouter.app/v1",
  apiKey: process.env.LLM_ROUTER_API_KEY,
});

async function main() {
  const response = await client.chat.completions.create({
    // Define the Executor : planning : Planner
    model: "anthropic/claude-3-5-sonnet:planning:openai/o1-mini",

    messages: [
      {
        role: "user",
        content: "Design a high-throughput microservices architecture for...",
      },
    ],

    // @ts-expect-error - Custom LLM Router extension
    gateway: {
      // Planning will only trigger if the prompt's complexity score is > 0.6
      planningTriggerScore: 0.6,
    },
  });

  console.log(response.choices[0].message.content);
}
main();

Using Planning with Tags

You can also use this syntax directly inside your gateway.tags arrays to build intent-based routing that adapts to task difficulty.
gateway: {
  tags: [
    {
      name: "complex_architecture",
      description: "System design, cloud architecture, database scaling",
      models: [
        // If the score > planningTriggerScore, this chain activates.
        "anthropic/claude-3-5-sonnet:planning:deepseek/deepseek-reasoner",

        // Standard fallback if the first chain fails
        "openai/gpt-4o",
      ],
    }
  ],
  planningTriggerScore: 0.75, // Only trigger planning for very hard tasks
}

Configuration Properties

The gateway Object

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| planningTriggerScore | number | 0.6 | The complexity threshold (0.0 to 1.0). If the request's internal complexity score is greater than this number, the two-step planning chain is executed. If it is lower, the executor model is called normally, bypassing the planner to save latency and cost. |
Cost & Latency Considerations: Triggering a planning phase means you are making two LLM API calls instead of one. Set your planningTriggerScore high enough (e.g., 0.7 or 0.8) so that simple requests don’t waste time and money generating unnecessary plans.
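As a back-of-the-envelope way to reason about this tradeoff, you can estimate the average number of API calls per request from the fraction of your traffic that exceeds the threshold. The trigger rate below is an assumption you would measure from your own prompt mix:

```typescript
// Planned requests cost 2 LLM calls (planner + executor); all others cost 1.
// triggerRate is the observed fraction of requests whose complexity score
// exceeds planningTriggerScore (a value you measure, not configure).
function expectedCallsPerRequest(triggerRate: number): number {
  return triggerRate * 2 + (1 - triggerRate) * 1;
}

// e.g. if 20% of traffic triggers planning, you average roughly 1.2 calls
// per request; raising the threshold lowers the trigger rate and the cost.
```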