Model Context Protocol (MCP) has become the standard way for AI agents to use external tools. But there’s a tension at its core: agents need many tools to do useful work, yet every tool added fills the model’s context window, leaving less room for the actual task.
Code Mode is a technique we first introduced for reducing context window usage during agent tool use. Instead of describing every operation as a separate tool, let the model write code against a typed SDK and execute the code safely in a Dynamic Worker Loader. The code acts as a compact plan. The model can explore tool operations, compose multiple calls, and return just the data it needs. Anthropic independently explored the same pattern in their Code Execution with MCP post.
Today we’re introducing a new MCP server for the entire Cloudflare API, from DNS and Zero Trust to Workers and R2, that uses Code Mode. With just two tools, search() and execute(), the server provides access to the entire Cloudflare API over MCP while consuming only around 1,000 tokens. The footprint stays fixed, no matter how many API endpoints exist.
For a large API like the Cloudflare API, Code Mode reduces the number of input tokens used by 99.9%. An equivalent MCP server without Code Mode would consume 1.17 million tokens, more than the entire context window of the most advanced foundation models.
Code Mode savings vs. native MCP, measured with tiktoken
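The 99.9% figure follows from the two token counts quoted above. A quick check, treating the Code Mode footprint as exactly 1,000 tokens (the post says "around 1,000", so the result is approximate):

```javascript
// Token counts quoted in the post; the Code Mode figure is rounded.
const nativeMcpTokens = 1_170_000;
const codeModeTokens = 1_000;

// Fraction of input tokens saved by switching to Code Mode.
function tokenReduction(before, after) {
  return 1 - after / before;
}

const reduction = tokenReduction(nativeMcpTokens, codeModeTokens);
console.log(`${(reduction * 100).toFixed(1)}% fewer input tokens`);
// prints "99.9% fewer input tokens"
```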
You can start using this new Cloudflare MCP server today. We’re also open-sourcing a new Code Mode SDK in the Cloudflare Agents SDK, so you can use the same approach in your own MCP servers and AI agents.
This new MCP server applies Code Mode server-side. Instead of thousands of tools, the server exports just two: search() and execute(). Both are powered by Code Mode. Here is the full tool surface area that gets loaded into the model context:
[
  {
    "name": "search",
    "description": "Search the Cloudflare OpenAPI spec. All $refs are pre-resolved inline.",
    "inputSchema": {
      "type": "object",
      "properties": {
        "code": {
          "type": "string",
          "description": "JavaScript async arrow function to search the OpenAPI spec"
        }
      },
      "required": ["code"]
    }
  },
  {
    "name": "execute",
    "description": "Execute JavaScript code against the Cloudflare API.",
    "inputSchema": {
      "type": "object",
      "properties": {
        "code": {
          "type": "string",
          "description": "JavaScript async arrow function to execute"
        }
      },
      "required": ["code"]
    }
  }
]
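Because each tool takes a single code string, an MCP client can invoke either one with an ordinary tools/call request. The sketch below shows the JSON-RPC message a client might send. buildToolCall is a hypothetical helper for illustration, not part of any SDK, and the id value is arbitrary.

```javascript
// Hypothetical helper: wraps model-written code in an MCP tools/call request.
function buildToolCall(id, toolName, code) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: toolName,      // "search" or "execute"
      arguments: { code }, // the only input either tool accepts
    },
  };
}

const request = buildToolCall(
  1,
  "search",
  "async () => Object.keys(spec.paths).length"
);
console.log(JSON.stringify(request, null, 2));
```

The request shape is the same for both tools; only the name and the code string change.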
To discover what it can do, the agent calls search(). It writes JavaScript against a typed representation of the OpenAPI spec. The agent can filter endpoints by product, path, tags, or any other metadata and narrow thousands of endpoints to the handful it needs. The full OpenAPI spec never enters the model context. The agent only interacts with it through code.
When the agent is ready to act, it calls execute(). The agent writes code that can make Cloudflare API requests, handle pagination, check responses, and chain operations together in a single execution.
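Pagination is a good example of work that can stay inside the sandbox. The sketch below loops over pages and returns only the record names. The mock cloudflare client is there so the example runs outside the sandbox; the query parameter shape passed to request() and the zoneId value are assumptions, and the result_info.total_pages field follows the Cloudflare API's usual list-response envelope.

```javascript
// Mock client so the sketch runs standalone; inside execute() the server provides cloudflare.request().
const allRecords = Array.from({ length: 5 }, (_, i) => ({ name: `host${i}.example.com` }));
const cloudflare = {
  async request({ query = {} }) {
    const page = query.page ?? 1;
    const perPage = query.per_page ?? 2;
    const start = (page - 1) * perPage;
    return {
      result: allRecords.slice(start, start + perPage),
      result_info: { page, per_page: perPage, total_pages: Math.ceil(allRecords.length / perPage) },
    };
  },
};
const zoneId = "example-zone-id"; // placeholder value

// The kind of function an agent might pass to execute().
const listAllRecordNames = async () => {
  const names = [];
  let page = 1;
  let totalPages = 1;
  do {
    const response = await cloudflare.request({
      method: "GET",
      path: `/zones/${zoneId}/dns_records`,
      query: { page, per_page: 2 }, // assumed parameter shape
    });
    names.push(...response.result.map((record) => record.name));
    totalPages = response.result_info?.total_pages ?? page;
    page += 1;
  } while (page <= totalPages);
  return names; // only the names go back to the model, not the full records
};

listAllRecordNames().then((names) => console.log(names));
```

Three requests happen inside one tool call, and the model sees only the final array.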
Both tools run the generated code inside a Dynamic Worker isolate: a lightweight V8 sandbox with no file system, no environment variables to leak through prompt injection, and external fetches disabled by default. Outbound requests can be explicitly controlled with outbound fetch handlers when needed.
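The idea of explicitly controlled outbound requests can be illustrated with a plain allowlist wrapper around fetch. This is only a conceptual sketch: createGuardedFetch is a hypothetical name, the stub fetch replaces real network access, and none of this is the Workers outbound handler API itself.

```javascript
// Hypothetical guard: only hosts on the allowlist may be reached.
function createGuardedFetch(allowedHosts, underlyingFetch) {
  const allowed = new Set(allowedHosts);
  return async (input, init) => {
    // Accept a URL string, a URL object, or a Request-like object with a url field.
    const raw = typeof input === "object" && input !== null && "url" in input ? input.url : String(input);
    const url = new URL(raw);
    if (!allowed.has(url.hostname)) {
      throw new Error(`Outbound request to ${url.hostname} is blocked`);
    }
    return underlyingFetch(input, init);
  };
}

// Stub fetch so the sketch runs without network access.
const stubFetch = async (input) => ({ ok: true, url: String(input) });
const guardedFetch = createGuardedFetch(["api.cloudflare.com"], stubFetch);
```

Denying by default and allowing specific hosts mirrors the sandbox's posture described above: generated code can reach only what the server operator has chosen to expose.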
Example: Protecting an origin from DDoS attacks
Suppose a user tells their agent: “protect my origin from DDoS attacks.” The agent’s first step is to consult documentation. It might call the Cloudflare Docs MCP Server, use a Cloudflare Skill, or search the web directly. From the docs it learns: put Cloudflare WAF and DDoS protection rules in front of the origin.
Step 1: Search for the right endpoints
The search tool gives the model a spec object: the full Cloudflare OpenAPI spec with all $refs pre-resolved. The model writes JavaScript against it. Here the agent looks for WAF and ruleset endpoints on a zone:
async () => {
  const results = [];
  // Scan every path in the spec for zone-level WAF and ruleset endpoints
  for (const [path, methods] of Object.entries(spec.paths)) {
    if (path.includes('/zones/') &&
        (path.includes('firewall/waf') || path.includes('rulesets'))) {
      for (const [method, op] of Object.entries(methods)) {
        results.push({ method: method.toUpperCase(), path, summary: op.summary });
      }
    }
  }
  return results;
}
The server runs this code in a Workers isolate and returns:
[
{ "method": "GET", "path": "/zones/{zone_id}/firewall/waf/packages", "summary": "List WAF packages" },
{ "method": "PATCH", "path": "/zones/{zone_id}/firewall/waf/packages/{package_id}", "summary": "Update a WAF package" },
{ "method": "GET", "path": "/zones/{zone_id}/firewall/waf/packages/{package_id}/rules", "summary": "List WAF rules" },
{ "method": "PATCH", "path": "/zones/{zone_id}/firewall/waf/packages/{package_id}/rules/{rule_id}", "summary": "Update a WAF rule" },
{ "method": "GET", "path": "/zones/{zone_id}/rulesets", "summary": "List zone rulesets" },
{ "method": "POST", "path": "/zones/{zone_id}/rulesets", "summary": "Create a zone ruleset" },
{ "method": "GET", "path": "/zones/{zone_id}/rulesets/phases/{ruleset_phase}/entrypoint", "summary": "Get a zone entry point ruleset" },
{ "method": "PUT", "path": "/zones/{zone_id}/rulesets/phases/{ruleset_phase}/entrypoint", "summary": "Update a zone entry point ruleset" },
{ "method": "POST", "path": "/zones/{zone_id}/rulesets/{ruleset_id}/rules", "summary": "Create a zone ruleset rule" },
{ "method": "PATCH", "path": "/zones/{zone_id}/rulesets/{ruleset_id}/rules/{rule_id}", "summary": "Update a zone ruleset rule" }
]
The full Cloudflare API spec has over 2,500 endpoints. The model narrowed that to the WAF and ruleset endpoints it needs, without any of the spec entering the context window.
The model can also drill into a specific endpoint’s schema before calling it. Here it inspects what phases are available on zone rulesets:
async () => {
  const op = spec.paths['/zones/{zone_id}/rulesets']?.get;
  const items = op?.responses?.['200']?.content?.['application/json']?.schema;
  // Walk the schema to find the phase enum
  const props = items?.allOf?.[1]?.properties?.result?.items?.allOf?.[1]?.properties;
  return { phases: props?.phase?.enum };
}
{
"phases": [
"ddos_l4", "ddos_l7",
"http_request_firewall_custom", "http_request_firewall_managed",
"http_response_firewall_managed", "http_ratelimit",
"http_request_redirect", "http_request_transform",
"magic_transit", "magic_transit_managed"
]
}
The agent now knows the exact phases it needs: ddos_l7 for DDoS protection and http_request_firewall_managed for WAF.
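The fixed allOf[1] indices in the schema lookup depend on how the spec happens to nest its schemas. A more defensive variant, sketched here under the assumption that the enum sits on a property named phase somewhere in the response schema, walks the schema recursively. The sample schema is a made-up miniature, not the real spec, and findPropertyEnum is a hypothetical helper.

```javascript
// Recursively search a JSON Schema fragment for a named property that has an enum.
function findPropertyEnum(schema, propertyName, seen = new Set()) {
  if (schema === null || typeof schema !== "object" || seen.has(schema)) return undefined;
  seen.add(schema); // guard against cycles in pre-resolved $refs
  const direct = schema.properties?.[propertyName]?.enum;
  if (Array.isArray(direct)) return direct;
  for (const value of Object.values(schema)) {
    const found = findPropertyEnum(value, propertyName, seen);
    if (found) return found;
  }
  return undefined;
}

// Made-up miniature of a nested response schema.
const sampleSchema = {
  allOf: [
    { type: "object" },
    {
      properties: {
        result: {
          type: "array",
          items: { allOf: [{}, { properties: { phase: { enum: ["ddos_l4", "ddos_l7"] } } }] },
        },
      },
    },
  ],
};

console.log(findPropertyEnum(sampleSchema, "phase"));
```

The tradeoff is precision: a recursive walk returns the first match it finds, so it suits exploration more than cases where several properties share a name.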
Step 2: Act on the API
The agent switches to using execute(). The sandbox gets a cloudflare.request() client that can make authenticated calls to the Cloudflare API. First the agent checks what rulesets already exist on the zone:
async () => {
  const response = await cloudflare.request({
    method: "GET",
    path: `/zones/${zoneId}/rulesets`
  });
  // Return only the fields the agent needs
  return response.result.map(rs => ({
    name: rs.name, phase: rs.phase, kind: rs.kind
  }));
}
[
  { "name": "DDoS L7", "phase": "ddos_l7", "kind": "managed" },
  { "name": "Cloudflare Managed", "phase": "http_request_firewall_managed", "kind": "managed" },
  { "name": "Custom rules", "phase": "http_request_firewall_custom", "kind": "zone" }
]
The agent sees that managed DDoS and WAF rulesets already exist. It can now chain calls to inspect their rules and update sensitivity levels in a single execution:
async () => {
  // Get the current DDoS L7 entrypoint ruleset
  const ddos = await cloudflare.request({
    method: "GET",
    path: `/zones/${zoneId}/rulesets/phases/ddos_l7/entrypoint`
  });
  // Get the WAF managed ruleset
  const waf = await cloudflare.request({
    method: "GET",
    path: `/zones/${zoneId}/rulesets/phases/http_request_firewall_managed/entrypoint`
  });
  // Return both rulesets so the agent can inspect their rules
  return { ddos: ddos.result, waf: waf.result };
}
This entire operation, from searching the spec and inspecting a schema to listing rulesets and fetching DDoS and WAF configurations, took four tool calls.
The Cloudflare MCP server
We started with MCP servers for individual products. Want an agent that manages DNS? Add the DNS MCP server. Want Workers logs? Add the Workers Observability MCP server. Each server exported a fixed set of tools that mapped to API operations. This worked when the tool set was small, but the Cloudflare API has over 2,500 endpoints. No collection of hand-maintained servers could keep up.
The Cloudflare MCP server simplifies this. Two tools, roughly 1,000 tokens, and coverage of every endpoint in the API. When we add new products, the same search() and execute() code paths discover and call them, with no new tool definitions and no new MCP servers. It even supports the GraphQL Analytics API.
Our MCP server is built on the latest MCP specifications. It’s OAuth 2.1 compliant, using Workers OAuth Provider to downscope the token to the permissions the user approves when connecting. The agent only gets the capabilities the user explicitly granted.
For developers, this means you can use a simple agent loop and still give your agent access to the full Cloudflare API with built-in progressive capability discovery.
Comparing approaches to context reduction
Several approaches have emerged to reduce how many tokens MCP tools consume:
Client-side Code Mode was our first experiment. The model writes TypeScript against typed SDKs and runs it in a Dynamic Worker Loader on the client. The tradeoff is that it requires the agent to ship with secure sandbox access. Code Mode is implemented in Goose and Anthropic’s Claude SDK as Programmatic Tool Calling.
Command-line interfaces are another path. CLIs are self-documenting and reveal capabilities as the agent explores. Tools like OpenClaw and Moltworker convert MCP servers into CLIs using MCPorter to give agents progressive disclosure. The limitation is obvious: the agent needs a shell, which not every environment provides and which introduces a much broader attack surface than a sandboxed isolate.
Dynamic tool search, as used by Anthropic in Claude Code, surfaces a smaller set of tools hopefully relevant to the current task. It shrinks context use but now requires a search function that must be maintained and evaluated, and each matched tool still uses tokens.
Each approach solves a real problem. But for MCP servers specifically, server-side Code Mode combines their strengths: fixed token cost regardless of API size, no changes needed on the agent side, progressive discovery built in, and safe execution inside a sandboxed isolate. The agent just calls two tools with code. Everything else happens on the server.
The Cloudflare MCP server is available now. Point your MCP client at the server URL and you’ll be redirected to Cloudflare to authorize and select the permissions to grant to your agent. Add this config to your MCP client:
{
"mcpServers": {
"cloudflare-api": {
"url": "
}
}
}
For CI/CD, automation, or if you prefer managing tokens yourself, create a Cloudflare API token with the permissions you need. Both user tokens and account tokens are supported and can be passed as bearer tokens in the Authorization header.
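A minimal sketch of building that header in a script follows. The environment variable name and the fallback token are assumptions for illustration, and how you attach headers depends on your MCP client.

```javascript
// Build the Authorization header for a Cloudflare API token.
function buildAuthHeaders(apiToken) {
  if (!apiToken) {
    throw new Error("Missing API token");
  }
  return { Authorization: `Bearer ${apiToken}` };
}

// CLOUDFLARE_API_TOKEN is an assumed variable name; the fallback is a placeholder, not a real token.
const headers = buildAuthHeaders(process.env.CLOUDFLARE_API_TOKEN ?? "example-token");
```

Reading the token from the environment keeps it out of source control, which matters most in the CI/CD case.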
More information on different MCP setup configurations can be found in the Cloudflare MCP repository.
Code Mode solves context costs for a single API. But agents rarely talk to one service. A developer’s agent might need the Cloudflare API alongside GitHub, a database, and an internal docs server. Each additional MCP server brings the same context window pressure we started with.
Cloudflare MCP Server Portals let you compose multiple MCP servers behind a single gateway with unified auth and access control. We’re building a first-class Code Mode integration for all your MCP servers, exposing them to agents with built-in progressive discovery and the same fixed-token footprint, regardless of how many services sit behind the gateway.



