2026 marks the year agent harnesses move into real-world production. The software layer that manages how models interact with external systems — harnesses such as Codex, Claude Code, OpenCode, Pi, and Project Think — has evolved to the point where organizations are deploying agents as dependable, production-grade infrastructure rather than experimental demos.
However, creating agents that hold up under production demands is no simple task.
We discovered this through direct experience while building Project Think as our own first-party agent harness. As we collaborated with customers to run agents in live environments, we encountered a recurring set of distributed systems challenges that every cloud-deployed agent must confront. When an agent gets interrupted, how does it seamlessly pick up from where it stopped — preserving context and avoiding wasted tokens? How can agents execute untrusted code in a secure manner? How can agents access the tools they were specifically trained to use?
No harness can tackle these issues in isolation. They’re fundamentally connected to state management, storage, and compute — which means they depend on the underlying platform where the agent operates. That’s exactly why we’re taking the hard-won lessons from making Project Think production-ready and embedding them into the Cloudflare Agents SDK as foundational capabilities. Durable execution, dynamic code execution, a persistent filesystem, and dynamic workflows are now accessible to any harness built on top of Agents SDK.
Meanwhile, a new tier has taken shape above the harness layer. Frameworks such as Flue enhance a harness with project scaffolding, established conventions, integrations, and a developer experience that make building agents far more efficient.
Together, these pieces form a three-tier architecture for building production-grade AI systems. Here’s how the layers connect, from the developer-facing experience down to the core platform primitives:
The framework (Flue) — the project scaffolding, established conventions, integrations, command-line interface, and overall developer experience for constructing agents.
The harness (Pi, Project Think) — the agentic loop responsible for invoking tools, processing results, managing context, and continuing until the task reaches completion.
The runtime/platform (the Cloudflare Agents SDK) — the compute, state, and storage primitives that everything above relies on.
The Agents SDK serves as that foundational layer: it exposes primitives like durable execution to any harness or framework built on top of it. Flue, our new open-source framework created by the team behind Astro, is the first to leverage this foundation. Here’s how it works.
Flue released its 1.0 Beta this week, built on the Pi harness — the same harness that powers OpenClaw. What sets Flue apart as an agent framework is its philosophy: rather than scripting your agent’s every action, you describe what it should know. Specify the context an agent requires — its model, skills, sandbox environment, and instructions — and it independently figures out how to complete whatever task you assign. There’s no orchestration loop to manually code.
This declarative approach is what makes building agents straightforward: consider a triage agent that catches a bug report, reproduces the issue inside a sandbox, and pinpoints the root cause — all in under 25 lines of code.
The Flue developer experience
Flue’s strength lies in the fact that agents don’t operate in a vacuum. They’re designed to live where your users already spend their time and connect with the tools you already prefer:
Agents everywhere: Place your agents into Slack, GitHub, Linear, or Discord using pre-built Channels that automatically handle event verification and dispatch boilerplate.
Headless, yet UI-capable: Agents shouldn’t be opaque black boxes. Flue agents can run entirely headlessly for background operations, but @flue/react offers native frontend hooks that stream an agent’s state, tool execution, and live messages directly into your frontend application — eliminating the need to build custom real-time infrastructure from the ground up.
Built for the ecosystem: Flue simplifies adding and upgrading integrations through commands like flue add channel slack, which generates a Markdown blueprint that your own coding agent can read, modify, and cleanly weave into your codebase.
Built for production, not just prototypes
Transitioning an agent from a local terminal into a production environment brings classic distributed systems failures into play. Server crashes, API timeouts from LLM providers, and unexpected restarts all threaten to wipe out the short-term memory of an active agent turn.
Flue addresses this through Durable Streams. Every event in the execution history gets appended to an immutable log. By treating every prompt, tool response, and model decision as a permanent record, an agent’s state is never at risk of being lost. If a process crashes, another one simply picks up the log and resumes from the precise step where it left off.
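The idea of resuming from an append-only log can be sketched in a few lines. This is a minimal illustration, not Flue's actual Durable Streams format: the event shapes, the EventLog class, and the replay() function are all hypothetical names chosen for the example, and the log lives in memory rather than in durable storage.

```typescript
// Hypothetical event shapes; Flue's real Durable Streams format may differ.
type AgentEvent =
  | { kind: "prompt"; text: string }
  | { kind: "tool_call"; id: string; name: string }
  | { kind: "tool_result"; id: string; output: string }
  | { kind: "turn_done" };

// Append-only log: entries are frozen and never edited or removed.
class EventLog {
  private readonly entries: AgentEvent[] = [];

  append(event: AgentEvent): void {
    this.entries.push(Object.freeze({ ...event }));
  }

  read(): readonly AgentEvent[] {
    return this.entries;
  }
}

interface ResumePoint {
  done: boolean;
  pendingToolCalls: string[]; // tool calls with no recorded result yet
  nextIndex: number; // number of events already applied
}

// Rebuild the resume point purely from the log, as a fresh process would after a crash.
function replay(source: EventLog): ResumePoint {
  const pending = new Set<string>();
  let done = false;
  const events = source.read();
  for (const event of events) {
    if (event.kind === "tool_call") pending.add(event.id);
    else if (event.kind === "tool_result") pending.delete(event.id);
    else if (event.kind === "turn_done") done = true;
  }
  return { done, pendingToolCalls: Array.from(pending), nextIndex: events.length };
}

const turnLog = new EventLog();
turnLog.append({ kind: "prompt", text: "Why does the build fail?" });
turnLog.append({ kind: "tool_call", id: "t1", name: "read_file" });
turnLog.append({ kind: "tool_result", id: "t1", output: "tsconfig.json contents" });
turnLog.append({ kind: "tool_call", id: "t2", name: "run_tests" });
// Simulated crash here: t2 never recorded a result.

const resume = replay(turnLog);
```

A recovering process that only has the log can see that the turn is unfinished and that the run_tests call is still outstanding, which is enough to decide whether to retry that call or ask the model how to proceed.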
Deploy anywhere, including Cloudflare
Flue is a multi-cloud framework. On Node.js, each agent operates as a long-running process. You can deploy it to any virtual machine or container, execute it within GitHub Actions, or embed it on an existing server. But when you target Cloudflare, each agent becomes a Durable Object.
By running every Flue agent inside its own Durable Object, Cloudflare can automatically scale to however many agents you need, each with fully isolated storage and compute. There’s no server provisioning, no sticky session management, and no concerns about noisy neighbors. And when Flue agents are deployed to Cloudflare, they gain durable execution through Agents SDK’s runFiber(), stash(), and onFiberRecovered() methods. Flue also leverages @cloudflare/codemode and @cloudflare/shell for sandboxed code execution against a durable workspace.
What harnesses expect from an agentic platform
Flue’s Cloudflare target delivers strong results because it aligns directly with the foundational building blocks embedded in the Agents SDK. You can even explore the Flue codebase to see how Pi, the core harness, is modified to function on the Cloudflare Agents SDK.
Below is a breakdown of how Flue taps into the Agents SDK behind the scenes, along with what it requires to run any modern agent harness reliably at scale.
Every agent harness demands persistent execution
A single agent turn isn’t just one API call. The model generates tokens in a stream, invokes tools, pauses for responses, occasionally requests human approval, or hands tasks off to subagents. That entire flow can span seconds or even minutes, and at any moment the process could be interrupted or fail. When that occurs, all in-memory state is lost: the active stream, the pending tool invocations, and the agent’s position within its turn. While the conversation log may be saved to disk, the user is left staring at a spinner that never completes. That’s a degraded experience.
Fibers address this by offering a built-in checkpointing system within the Agent’s underlying Durable Object. runFiber() writes progress to the Durable Object’s SQLite store before an Agent turn begins, and checkpoints progress with stash() as the turn proceeds. When a new agent instance starts up after a disruption, onFiberRecovered() restores the last saved checkpoint, so your agent can detect an interruption, identify where things stopped, and determine how to proceed.
import { Agent } from "agents";
import type { FiberRecoveryContext } from "agents";

// expensiveOperation() and anotherExpensiveOperation() are placeholders
// for your own async work.
class MyAgent extends Agent {
  async doWork() {
    await this.runFiber("my-task", async (ctx) => {
      const step1 = await expensiveOperation();
      // Checkpoint step1 so it survives an interruption.
      ctx.stash({ step1 });
      const step2 = await anotherExpensiveOperation(step1);
      this.setState({ ...this.state, result: step2 });
    });
  }

  // Called on a new instance when an interrupted fiber is detected.
  async onFiberRecovered(ctx: FiberRecoveryContext) {
    if (ctx.name !== "my-task") return;
    const { step1 } = (ctx.snapshot ?? {}) as { step1?: unknown };
    if (step1) {
      // step1 was checkpointed, so only the second step is redone.
      const step2 = await anotherExpensiveOperation(step1);
      this.setState({ ...this.state, result: step2 });
    }
  }
}

Flue relies on runFiber() within its Cloudflare target for precisely this reason. Through the onFiberRecovered() hook, your harness controls how a turn resumes: it can fully reconstruct and repair turn state, as Project Think does, or selectively replay portions of the turn.
Running code beats flooding agents with tools
Agent harnesses connect models to external services through tools. But tool interfaces multiply quickly, and model performance degrades as the list grows longer and tool definitions consume more of the context window. A more effective approach: give the model a single tool that runs code. The model writes a TypeScript function that calls whichever APIs it needs, and the harness executes it. We covered this idea when we launched Code Mode.
The key question becomes where that code should execute. Safely running LLM-generated code requires a sandbox, but starting a conventional sandbox for every tool invocation is slow, expensive, and wasteful. That’s why the Agents SDK includes @cloudflare/codemode, which wraps Dynamic Workers so that LLM-generated code runs inside its own Worker isolate with only the bindings you specify.
Code Mode spins up a new Dynamic Worker for each code snippet, executes it, and tears it down. Isolates initialize in under 10ms and cost merely $0.002 per invocation, making execution dramatically faster and cheaper than launching a container every time your agent needs to run a small piece of code. Flue integrates @cloudflare/codemode in its Cloudflare target to drive its code tool. The model writes JavaScript targeting the workspace and executes it via Code Mode.
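To make the "one tool that runs code" shape concrete, here is a minimal sketch of the calling convention, assuming a hypothetical runSnippet() helper and a hypothetical issues binding. It is not the @cloudflare/codemode API, and new Function is not a sandbox: it only stands in for the isolate so the flow of bindings is visible.

```typescript
// Sketch only: `new Function` is NOT a sandbox. It stands in for the isolate
// so the calling convention is visible; real isolation needs a Worker isolate.
type Bindings = Record<string, unknown>;

// Run model-written source (an arrow function expression) with named bindings in scope.
function runSnippet(source: string, bindings: Bindings): unknown {
  const names = Object.keys(bindings);
  const values = names.map((name) => bindings[name]);
  const factory = new Function(...names, `"use strict"; return (${source});`);
  const snippet = factory(...values) as () => unknown;
  return snippet();
}

// Hypothetical binding: the only capability this snippet is given.
const issues = {
  list: (): { id: number; open: boolean }[] => [
    { id: 1, open: true },
    { id: 2, open: false },
    { id: 3, open: true },
  ],
};

// What a model might write instead of chaining several separate tool calls.
const modelWritten = `() => issues.list().filter((i) => i.open).map((i) => i.id)`;

const openIds = runSnippet(modelWritten, { issues });

// A name that was not passed as a binding is not in scope for the snippet.
let leaked = true;
try {
  runSnippet(`() => secrets.apiKey`, { issues });
} catch {
  leaked = false; // the snippet throws a ReferenceError for `secrets`
}
```

The model filters and maps in one round trip rather than through three tool definitions, and the harness decides exactly which names the snippet can see.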
Full containers are overkill for most workspace operations
Agent harnesses frequently require a filesystem, whether for reading documents, writing outputs, browsing code, or understanding diffs. Coding-focused agents especially depend on filesystem access. But if your harness runs in a serverless environment, how do you provide a persistent filesystem that survives across invocations?
The typical solution is a container. It works, but it’s costly relative to what agents actually do. Most filesystem activity in an agent turn involves text. Think of a review agent that reads files, searches through source code, or perhaps generates a patch. You don’t need a full Linux environment for that.
@cloudflare/shell gives your agent a durable filesystem. Flue includes a virtual file system within its Durable Object, powered by SQLite. It offers typed file operations, such as read, write, edit, search, grep, and diff, that agent harnesses can use as tools.
Rather than calling individual tools, a Flue agent running on the Cloudflare target writes JavaScript against the workspace’s virtual file state API. Because these operations run inside the Durable Object’s isolate, the agent avoids container overhead entirely:
async () => {
  // Collect every "// TODO:" comment under src/.
  const files = await state.glob("src/**/*.ts");
  const results = [];
  for (const file of files) {
    const content = await state.readFile(file);
    const todos = content.match(/\/\/ TODO:.*/g);
    if (todos) results.push({ file, todos });
  }
  return results;
}

This translates into a faster and more cost-efficient sandbox for agents that need to run shell and filesystem operations to get their work done. For agents that need a full OS to run npm install, git, or compilers, Cloudflare Containers provides that. We’re also building @cloudflare/workspace to keep the virtual file system of a given Durable Object in sync with a container’s, so an agent can move from a lightweight Worker to a full Linux environment only when it needs one.
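The TODO-scanning snippet earlier in this section can be tried locally against a small in-memory stand-in. The MemoryWorkspace class, globToRegExp(), and findTodos() below are hypothetical helpers written for this sketch; they model only glob() and readFile(), synchronously, and make no claim about how the real workspace is implemented.

```typescript
// Hypothetical in-memory stand-in for the workspace `state` object.
// Only glob() and readFile() are modelled; the real API is broader.
class MemoryWorkspace {
  private readonly files = new Map<string, string>();

  writeFile(path: string, content: string): void {
    this.files.set(path, content);
  }

  readFile(path: string): string {
    const content = this.files.get(path);
    if (content === undefined) throw new Error(`No such file: ${path}`);
    return content;
  }

  // Supports only `*` (within one path segment) and `**/` (any number of directories).
  glob(pattern: string): string[] {
    const matcher = globToRegExp(pattern);
    return Array.from(this.files.keys())
      .filter((path) => matcher.test(path))
      .sort();
  }
}

function globToRegExp(pattern: string): RegExp {
  let out = "";
  let i = 0;
  while (i < pattern.length) {
    if (pattern.startsWith("**/", i)) {
      out += "(?:.*/)?"; // zero or more directories
      i += 3;
    } else if (pattern.charAt(i) === "*") {
      out += "[^/]*"; // anything except a path separator
      i += 1;
    } else {
      // Escape regex metacharacters in literal path characters.
      out += pattern.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${out}$`);
}

// Same logic as the agent-written snippet, run against the stand-in.
function findTodos(ws: MemoryWorkspace): { file: string; todos: string[] }[] {
  const results: { file: string; todos: string[] }[] = [];
  for (const file of ws.glob("src/**/*.ts")) {
    const todos = ws.readFile(file).match(/\/\/ TODO:.*/g);
    if (todos) results.push({ file, todos: Array.from(todos) });
  }
  return results;
}

const workspace = new MemoryWorkspace();
workspace.writeFile("src/index.ts", "// TODO: handle errors\nexport {};\n");
workspace.writeFile("src/lib/util.ts", "export const x = 1;\n");
workspace.writeFile("README.md", "// TODO: not scanned\n");

const todoReport = findTodos(workspace);
```

Everything here is string handling over a key-value map, which is the point the section makes: most agent filesystem work is text, and text does not need a Linux environment.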
Dynamic Workflows: let agents write their own workflows to repeat tasks consistently
But what happens when an agent needs to do more than read files or execute single code snippets? What happens when it needs to orchestrate a massive, multi-step pipeline that must repeat consistently over time, like a code review that successfully resolves bugs or a research workflow that produces good results? A harness can’t provide durable multi-step execution on its own. It needs the platform to persist each step, retry failures, and resume after interruptions.
This pattern is gaining traction. Claude Code recently shipped dynamic workflows, where Claude writes a JavaScript script at runtime to hand off work to dozens of subagents, and the runtime executes it durably. @cloudflare/dynamic-workflows provides this for any harness running on the Agents SDK. Your agent generates a workflow at runtime, and the Workflows engine persists each step, retries failures, and can sleep for hours or wait for external events like human approval.
From the Agent class, runWorkflow() connects your agent to the Workflows engine. The agent kicks off the workflow and can go to sleep. The workflow calls back into the agent via RPC to report progress, update state, or request approval. When the workflow finishes, the agent wakes up with the result.
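What "persist each step, retry failures, and resume" means can be shown with a toy step runner. This is a hedged sketch, not the Workflows engine or the @cloudflare/dynamic-workflows API: runStep(), StepStore, and reviewWorkflow() are hypothetical names, a Map stands in for durable storage, and steps are synchronous to keep the example short.

```typescript
// Hypothetical step runner; it is not the Workflows engine, only the idea behind it.
// Steps are synchronous here to keep the sketch short.
type StepStore = Map<string, unknown>; // stands in for durable storage

function runStep<T>(store: StepStore, name: string, fn: () => T, maxAttempts = 3): T {
  // A step that already completed is never re-executed after a restart.
  if (store.has(name)) return store.get(name) as T;

  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = fn();
      store.set(name, result); // persist before moving on
      return result;
    } catch (error) {
      lastError = error; // retry until attempts run out
    }
  }
  throw lastError;
}

let fetchCalls = 0;
let analyzeCalls = 0;

// A workflow an agent might generate at runtime: two named steps.
function reviewWorkflow(store: StepStore): string {
  const diff = runStep(store, "fetch-diff", () => {
    fetchCalls++;
    return "+ const x = 1;";
  });
  return runStep(store, "analyze", () => {
    analyzeCalls++;
    if (analyzeCalls < 2) throw new Error("model timeout"); // first attempt fails
    return `1 finding in: ${diff}`;
  });
}

const durableStore: StepStore = new Map();
const firstRun = reviewWorkflow(durableStore);
// Simulated restart: same store, fresh invocation. Nothing is recomputed.
const secondRun = reviewWorkflow(durableStore);
```

The second invocation returns the stored results without calling either step body again, which is why a harness cannot offer this on its own: the guarantee comes from wherever the step results are stored.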
Direct access to the Cloudflare ecosystem
Beyond compute and storage, agent harnesses need access to external capabilities: web browsing, email, memory, search, inference. A harness shouldn’t have to integrate each of these separately, manage API keys for each, or worry about credentials leaking through agent-generated code.
The Agent class gives your harness access to the rest of Cloudflare through bindings: AI Gateway for per-agent spend tracking and limits, Browser Run for web automation, Email Service for inbox workflows, Agent Memory for persistent recall, AI Search for retrieval, Containers for workloads that need a full OS, and inference across 14+ model providers. Bindings grant capabilities without exposing credentials: your agent uses them, but the keys never enter agent-generated code.
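The principle behind bindings, a capability that can be used without the credential being readable, can be illustrated with a closure. This is only an analogy under stated assumptions: createSearchBinding(), the Transport type, the search.example URL, and the demo key are all hypothetical, and real bindings are provided by the platform rather than built in user code.

```typescript
// Hypothetical shapes: a "binding" here is just an object whose methods
// close over a credential that is never exposed as a property.
interface SearchBinding {
  search(query: string): string;
}

type Transport = (url: string, headers: Record<string, string>) => string;

function createSearchBinding(apiKey: string, transport: Transport): SearchBinding {
  return {
    // The key is attached here, on the trusted side of the boundary.
    search: (query) =>
      transport(`https://search.example/?q=${encodeURIComponent(query)}`, {
        Authorization: `Bearer ${apiKey}`,
      }),
  };
}

// Fake transport so the sketch runs offline; it records what it was sent.
const seenHeaders: Record<string, string>[] = [];
const fakeTransport: Transport = (url, headers) => {
  seenHeaders.push(headers);
  return `results for ${url}`;
};

const searchBinding = createSearchBinding("sk-demo-123", fakeTransport);

// Agent-generated code only ever receives `searchBinding`.
const agentOutput = searchBinding.search("durable objects");
const visibleToAgent = JSON.stringify(searchBinding); // functions are not serialized
```

The request carries the key, but neither the object handed to the agent nor the value it gets back contains it.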
Bring your agents to the agentic cloud
We know this approach works because it is the exact architectural foundation we used to build Project Think, our first-party agent harness. While Project Think remains our highly optimized, out-of-the-box solution for native Cloudflare agent experiences, the Agents SDK ensures that the broader open-source ecosystem can leverage those exact same battle-tested primitives, including Flue.
If you’re building agents today with Flue, you can deploy to Cloudflare in just a few clicks. And if you’re building your own agent harness or framework, target the Agents SDK and get the platform integration for free.



