Nous Research’s open-source Hermes Agent now includes a Tool Search capability. It tackles a key challenge in AI agent systems: an overwhelming number of MCP tools consuming available context window space. This guide explains what Tool Search does, how it functions, and when to apply it.
The Challenge: MCP Tools Consuming Context Space
When multiple MCP (Model Context Protocol) servers are linked to an AI agent, the complete JSON schema for every tool is transmitted to the model during each interaction. This occurs regardless of how many tools are actually required for the current task.
This issue becomes apparent quickly in practical applications. A Hermes setup with five MCP servers and 34 tools results in average prompts of approximately 45,000 tokens per interaction. Nearly 22,000 of those tokens – about 50% – are solely for tool schema data.
Anthropic’s own technical reports indicated tool definitions can occupy up to 134,000 tokens before optimization. Their research quantifies the “MCP Tools Tax” at 15,000–60,000 tokens per turn for typical deployments using multiple servers.
This situation creates two significant issues:
- Cost: Initial cache-miss responses can cost $0.07–$0.10 per interaction.
- Reduced accuracy: The model can become overwhelmed when presented with hundreds of irrelevant tool choices at once.
Tool Search functions as Hermes Agent’s optional layer for managing MCP and non-core plugin access. Rather than pre-loading all tool schemas, the model retrieves only what’s necessary – on demand, when needed.
When Tool Search is active, MCP and plugin tools are replaced in the model’s tool list by three interface tools:
tool_search(query, limit?) – search the available tool catalog
tool_describe(name) – access the full schema for one tool
tool_call(name, arguments) – execute a deferred tool
A standard interaction would proceed as follows:
Model: tool_search("create a github issue")
→ { matches: [{ name: "mcp_github_create_issue", ... }] }
Model: tool_describe("mcp_github_create_issue")
→ { parameters: { type: "object", properties: { ... } } }
Model: tool_call("mcp_github_create_issue", { title: "...", body: "..." })
→ { ok: true, issue_number: 42 }
The model locates the required tool, accesses its schema, then performs the action. All existing hooks, safety checks, and approval prompts function with the actual underlying tool name – not the interface layer.
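The three-step flow above can be sketched in a few lines of Python. This is only an illustration, not Hermes Agent's implementation: the `REGISTRY` dictionary, the stubbed `create_issue` handler, the `CALL_LOG` list standing in for hooks and approval prompts, and the naive keyword match inside `tool_search` are all assumptions made for the example.

```python
from typing import Any

def create_issue(title: str, body: str = "") -> dict[str, Any]:
    # Stub handler (assumption): a real MCP tool would call the GitHub API.
    return {"ok": True, "issue_number": 42}

# Hypothetical registry: real tool name -> (schema, handler).
REGISTRY = {
    "mcp_github_create_issue": (
        {
            "description": "Create a GitHub issue",
            "parameters": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "body": {"type": "string"},
                },
                "required": ["title"],
            },
        },
        create_issue,
    ),
}

# Stand-in for hooks and approval prompts: records the name they would see.
CALL_LOG: list[str] = []

def tool_search(query: str, limit: int = 5) -> dict[str, Any]:
    """Return names and short descriptions only, never full schemas."""
    query_terms = set(query.lower().split())
    matches = []
    for name, (schema, _handler) in REGISTRY.items():
        tool_terms = set(name.lower().split("_"))
        tool_terms |= set(schema["description"].lower().split())
        if query_terms & tool_terms:  # naive keyword overlap, not BM25
            matches.append({"name": name, "description": schema["description"]})
    return {"matches": matches[:limit]}

def tool_describe(name: str) -> dict[str, Any]:
    """Return the full parameter schema for a single tool."""
    schema, _handler = REGISTRY[name]
    return {"parameters": schema["parameters"]}

def tool_call(name: str, arguments: dict[str, Any]) -> Any:
    """Dispatch to the underlying tool under its real name."""
    _schema, handler = REGISTRY[name]
    CALL_LOG.append(name)  # hooks see "mcp_github_create_issue", not "tool_call"
    return handler(**arguments)
```

The point of the sketch is the shape of the bridge: search results stay small, the full schema is fetched for one tool at a time, and the dispatch step passes the real tool name to whatever checks run before execution.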
The Performance Data
This feature is not merely about reducing tokens. Tool Search also improves model accuracy on MCP evaluation benchmarks.
Based on Anthropic’s internal MCP testing:
- Claude Opus 4: accuracy increased from 49% → 74% with Tool Search enabled
- Claude Opus 4.5: accuracy improved from 79.5% → 88.1% with Tool Search enabled
Overwhelming tool libraries cause “choice overload” — the AI gets lost sorting through dozens of irrelevant options. Cutting those options from the model’s working memory minimizes incorrect tool selections. Anthropic’s benchmarks confirm an 85% drop in tool-definition token consumption while keeping the entire tool library accessible.
How Retrieval Works: BM25 and Fallback Logic
At its core, Hermes relies on BM25 — a proven search-ranking algorithm — to match the model’s request against a registry of tool names, descriptions, and parameter fields.
When BM25 produces no results with a positive score, the system falls back to a basic substring lookup on the tool name. This safety net handles edge cases like searching for "github" when nearly every tool name in the catalog already contains "github": a term that common carries almost no ranking weight, so BM25 alone can leave every tool without a positive score.
The catalog is rebuilt fresh each turn from the current list of tool definitions. This design eliminates stale-catalog bugs where a cached copy falls out of alignment with the live tool registry.
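The retrieval logic described above can be approximated with a small, dependency-free BM25 scorer plus a substring fallback. This is a sketch under stated assumptions, not Hermes Agent's code: the classic Okapi IDF formula (which can go negative for very common terms), the `k1` and `b` defaults, the tokenizer, and the `CATALOG` sample data are all choices made for illustration. Each entry's text is assumed to be the tool name, description, and parameter names joined together.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    # Lowercase and split on anything that is not a letter or digit,
    # so "mcp_github_create_issue" yields four separate terms.
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query: str, docs: dict[str, str],
                k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    """Score every document against the query with Okapi BM25."""
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / max(n_docs, 1)
    # Number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized.values() for term in set(toks))
    query_terms = set(tokenize(query))
    scores: dict[str, float] = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Classic IDF: zero or negative once a term is in over half the docs.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return scores

def search(query: str, docs: dict[str, str], limit: int = 5) -> list[str]:
    """Rank by BM25; fall back to a substring match on the tool name."""
    scores = bm25_scores(query, docs)
    ranked = sorted((n for n, s in scores.items() if s > 0),
                    key=lambda n: scores[n], reverse=True)
    if not ranked:
        needle = query.lower().strip()
        ranked = sorted(n for n in docs if needle in n.lower())
    return ranked[:limit]

# Hypothetical catalog, passed in fresh on every call.
CATALOG = {
    "mcp_github_create_issue": "mcp_github_create_issue Create a new issue in a GitHub repository title body",
    "mcp_github_list_pulls": "mcp_github_list_pulls List pull requests in a GitHub repository state",
    "mcp_github_merge_pull": "mcp_github_merge_pull Merge a pull request on GitHub number",
}
```

With this catalog, `search("create issue", CATALOG)` returns only the issue tool through BM25, while `search("github", CATALOG)` gets no positive BM25 score because every entry contains the term, so the substring fallback returns all three names. Because the statistics are recomputed from the `docs` argument on each call, there is no cached index to fall out of sync.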
By default, Tool Search operates in auto mode. It engages only when the deferrable tool schemas would take up at least 10% of the active model’s context window.
Under that threshold, tool-array assembly passes through as-is. There’s zero added cost.
This check runs again on every turn:
- A session using just a handful of MCP tools with a long-context model may never trigger Tool Search.
- A session with several MCP servers connected (typically 15 or more tools) usually crosses the threshold and activates it.
- Disconnecting servers mid-session smoothly reverts to direct tool exposure during the next assembly.
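The per-turn decision described above comes down to one comparison. The sketch below is an assumption-laden illustration rather than the actual Hermes logic: the function names are hypothetical, and `estimate_tokens` uses a rough four-characters-per-token heuristic where a real implementation would presumably use the model's tokenizer.

```python
import json

def estimate_tokens(schema: dict) -> int:
    # Rough heuristic (assumption): about 4 characters of JSON per token.
    return len(json.dumps(schema)) // 4

def should_defer(deferrable_schemas: list[dict], context_window: int,
                 enabled: str = "auto", threshold_pct: float = 10) -> bool:
    """Decide, for the current turn, whether to swap in the bridge tools."""
    if enabled == "off" or not deferrable_schemas:
        return False  # nothing to defer, or the feature is disabled
    if enabled == "on":
        return True   # forced on whenever a deferrable tool exists
    schema_tokens = sum(estimate_tokens(s) for s in deferrable_schemas)
    # auto: engage when schemas reach threshold_pct of the context window.
    return schema_tokens * 100 >= threshold_pct * context_window
```

Calling this on every turn with the current tool list is what lets a session move in and out of Tool Search as servers connect or disconnect.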
Configuration Reference
Include this block in your hermes.yaml to fine-tune the behavior:
tools:
  tool_search:
    enabled: auto              # auto (default), on, or off
    threshold_pct: 10          # % of context at which auto mode activates
    search_default_limit: 5
    max_search_limit: 20

| Key | Default | Description |
|---|---|---|
| enabled | auto | auto turns on above the threshold; on forces it whenever at least one deferrable tool exists; off disables it completely |
| threshold_pct | 10 | Context-window percentage at which auto engages. Range: 0–100 |
| search_default_limit | 5 | Number of matches returned when the model invokes tool_search without specifying a limit |
| max_search_limit | 20 | Maximum number of matches the model can request via limit. Range: 1–50 |
You can also use a simple boolean as shorthand:
tools:
  tool_search: true   # equivalent to {enabled: auto}
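To make the defaults, ranges, and boolean shorthand concrete, here is one way a loader could normalize the already-parsed `tool_search` value. It is a hypothetical sketch: `ToolSearchConfig` and `normalize_tool_search` are invented names, treating `false` as `off` and rejecting out-of-range values with `ValueError` are assumptions, and so is the handling of a boolean `enabled` value (YAML 1.1 parsers such as PyYAML read bare `on` and `off` as booleans).

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class ToolSearchConfig:
    enabled: str = "auto"
    threshold_pct: float = 10
    search_default_limit: int = 5
    max_search_limit: int = 20

def normalize_tool_search(raw: Any) -> ToolSearchConfig:
    """Turn the parsed YAML value into a validated config object."""
    if raw is None or raw is True:
        return ToolSearchConfig()               # `true` means {enabled: auto}
    if raw is False:
        return ToolSearchConfig(enabled="off")  # assumption: `false` means off
    if not isinstance(raw, dict):
        raise ValueError("tool_search must be a boolean or a mapping")
    values = dict(raw)
    enabled = values.get("enabled", "auto")
    if isinstance(enabled, bool):
        # Assumption: a YAML 1.1 parser turned bare on/off into a boolean.
        values["enabled"] = "on" if enabled else "off"
    cfg = ToolSearchConfig(**values)
    if cfg.enabled not in ("auto", "on", "off"):
        raise ValueError("enabled must be auto, on, or off")
    if not 0 <= cfg.threshold_pct <= 100:
        raise ValueError("threshold_pct must be between 0 and 100")
    if not 1 <= cfg.max_search_limit <= 50:
        raise ValueError("max_search_limit must be between 1 and 50")
    return cfg
```

For example, `normalize_tool_search(True)` yields the same object as `normalize_tool_search({})`, matching the shorthand described above.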
Key Takeaways
- Tool Search holds off on loading MCP tool schemas until the model specifically requests one, through a tool_search / tool_describe / tool_call bridge.
- Anthropic’s benchmarks show accuracy climbing from 49% → 74% on Claude Opus 4 when dealing with large tool collections.
- BM25 retrieval across tool names, descriptions, and parameter labels drives the search, with substring matching as a fallback for tricky edge cases.
- auto mode (on by default) adapts on its own: it kicks in only when tool definitions take up at least 10% of the available context window.
- Core Hermes tools are never deferred; only MCP-sourced and non-core plugin tools qualify.
Explore the Hermes Agent Tool Search Docs and Anthropic Advanced Tool Use.



