When building complex AI agents, it’s common to provide the model with a massive list of available functions (e.g.,
search_database, send_email, calculate_math, create_ticket).
However, sending 50 tool definitions in a single request when the user just said “Hello” causes two major problems:
- Token Waste: Tool definitions consume a large number of input tokens on every request.
- Hallucination Risk: The more tools a model sees, the more likely it is to get confused and try to call the wrong tool.
To solve this, the Gateway can automatically prune the `tools` array before sending it to the upstream model.
How Tool Optimization Works
When a request containing a `tools` array arrives, our internal Gateway AI acts as a Tool Selector Agent.
- It analyzes the user’s prompt against the descriptions of all provided tools.
- It assigns a Relevance Score (0.0 to 1.0) to each tool.
- It detects Dependencies (e.g., if `create_element` requires `get_context`, both are scored highly).
- If a tool's score falls below your configured threshold (`acceptScore`), it is stripped from the request entirely.
Configuration
You configure this behavior inside the `gateway.toolOptimization` object.
TypeScript
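A minimal configuration sketch for the walkthrough below. The request wrapper, model name, and tool definitions (`calculatorTool`, etc.) are assumptions for illustration; the `toolOptimization` properties are those documented in the table further down.

```typescript
// Hypothetical request shape; only gateway.toolOptimization is documented here.
const response = await gateway.chat.completions.create({
  model: "claude-sonnet-4",
  messages: [{ role: "user", content: "What is 15% of 2,840?" }],
  tools: [calculatorTool, sendEmailTool, searchDbTool],
  gateway: {
    toolOptimization: {
      enabled: true,                // turn the filtering engine on
      acceptScore: 0.5,             // drop tools scoring below 0.5
      alwaysInclude: ["search_db"], // always forward search_db, unscored
    },
  },
});
```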
What happens in this example?
- The user asks a math question.
- The Gateway scores `calculator` at 1.0. It scores `send_email` at 0.0.
- It sees `search_db` in your `alwaysInclude` array.
- Result: The router strips `send_email` and sends ONLY `calculator` and `search_db` to Claude. You save tokens, and Claude doesn't get distracted.
Configuration Properties
The toolOptimization Object
| Property | Type | Default | Description |
|---|---|---|---|
| `enabled` | boolean | `true` | Toggles the probabilistic tool filtering engine on or off. |
| `acceptScore` | number | `0.5` | The minimum relevance score (0.0 to 1.0) required for a tool to be included in the final request to the LLM. |
| `alwaysInclude` | string[] | `[]` | An array of exact function names (e.g., `["get_weather"]`). These tools will always be sent to the LLM, bypassing the scoring engine entirely. |
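The interaction between `acceptScore` and `alwaysInclude` can be sketched as follows. This is an illustrative model of the selection step, not the gateway's actual implementation; `filterTools` and `ScoredTool` are hypothetical names.

```typescript
// Sketch of the threshold check plus the alwaysInclude bypass.
interface ScoredTool {
  name: string;
  score: number; // relevance score assigned by the Tool Selector Agent (0.0-1.0)
}

function filterTools(
  tools: ScoredTool[],
  acceptScore: number,
  alwaysInclude: string[]
): string[] {
  return tools
    .filter((t) => alwaysInclude.includes(t.name) || t.score >= acceptScore)
    .map((t) => t.name);
}

// Mirrors the worked example above: send_email is stripped,
// search_db survives despite its low score because it is pinned.
filterTools(
  [
    { name: "calculator", score: 1.0 },
    { name: "send_email", score: 0.0 },
    { name: "search_db", score: 0.1 },
  ],
  0.5,
  ["search_db"]
); // → ["calculator", "search_db"]
```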