Supercharging Cloudflare’s App Universe: The Key of Universal OAuth

Cloudflare powers roughly one-fifth of the internet, but it doesn’t do it all alone. Developers building on the platform draw on a wide range of external tools and services alongside Cloudflare’s own offerings. To help tie everything together, Cloudflare exposes a robust API that developers use to build automations, CI/CD pipelines, and integrations that connect the different pieces of their infrastructure. Earlier this month, Cloudflare introduced self-managed OAuth, a feature that simplifies how customers create and manage their own OAuth clients for delegated access to the Cloudflare API.

Cloudflare is no stranger to OAuth. If you’ve used Wrangler or integrations from partners such as PlanetScale, you’ve already interacted with it. Until now, however, third-party OAuth access was restricted to a handful of manually onboarded integrations and wasn’t available to the broader developer community. That meant developers building custom integrations had no choice but to rely on API tokens, which are more cumbersome to manage and don’t suit many delegated-access scenarios particularly well.

Over the past year, Cloudflare onboarded a steadily growing set of early partners while simultaneously refining the consent flow, revocation mechanisms, and overall security posture behind their OAuth implementation. But as the Developer Platform expanded and AI-driven agent tools created surging demand for delegated access, it became evident that making OAuth available to all customers was essential to the platform’s long-term success.

With self-managed OAuth, developers can now implement a standard OAuth flow in which customers grant scoped permissions directly. This makes it far simpler to build SaaS integrations, internal developer platforms, and agentic tools, all while giving end users clearer consent prompts, straightforward revocation, and greater control over what each application is allowed to do.
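As a rough illustration of what the start of a standard OAuth flow with scoped permissions can look like, the sketch below builds an authorization-code request URL with PKCE (RFC 7636). The endpoint, client ID, redirect URI, and scope names are all hypothetical placeholders, and the article does not say whether Cloudflare requires PKCE; check the official documentation for the real values.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

# Hypothetical values: replace with the real endpoint, client ID, and
# redirect URI from your OAuth provider's documentation.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth2/auth"
CLIENT_ID = "my-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    verifier = secrets.token_urlsafe(64)  # 86 characters, within the 43-128 limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url without padding, as RFC 7636 specifies.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorization_url(scopes: list[str], state: str, challenge: str) -> str:
    """Build the URL the user is sent to in order to grant the requested scopes."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(scopes),  # scopes are space-delimited in OAuth 2.0
        "state": state,             # echoed back on redirect; guards against CSRF
        "code_challenge": challenge,
        "code_challenge_method": "S256",
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


verifier, challenge = make_pkce_pair()
# "zone.read" and "dns.read" are made-up scope names for illustration.
url = build_authorization_url(["zone.read", "dns.read"], secrets.token_urlsafe(16), challenge)
```

The application keeps the verifier private and sends it later when exchanging the returned code for tokens, so an intercepted code is useless on its own.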

Scaling the ecosystem securely

While the previous OAuth setup worked well enough for a small, tightly managed group of partners, the team recognized that their permissions model, consent experience, and abuse-mitigation strategies weren’t yet mature enough for broader exposure.

Earlier this year, they overhauled the consent experience so that it’s now much clearer which application is requesting access and exactly which permissions it will receive. They also added a revocation mechanism to the dashboard, giving developers an easy way to control which applications can access their data, and made app ownership more visible to help defend against OAuth phishing attacks.

Rolling out self-managed OAuth to every customer also demanded significant upgrades to the underlying OAuth engine. This undertaking required extensive planning so that the transition would cause minimal disruption to users while preserving data stability and security throughout.

Planning the upgrade to our OAuth engine

Several years ago, Cloudflare deployed Hydra, an open-source OAuth engine, to serve as the backbone of their OAuth infrastructure. That deployment performed reliably while usage was limited, but as the developer platform scaled and agentic workflows grew more prevalent, it became obvious that a major upgrade was needed to unlock new capabilities and boost performance.

During the planning phase, the team opted to carry out two smaller sequential upgrades rather than a single large leap. First, they would migrate to the latest 1.X release, assess any behavioral or performance differences, and only then proceed with the 2.X upgrade.

While mapping out the upgrade, they discovered that even the 1.X transition would affect customers, because the Hydra database required extensive schema migrations that:

Created indexes in a way that would acquire an exclusive lock on critical tables, blocking active users from performing essential OAuth operations
Added columns to critical tables and relocated other columns to entirely new tables

There was also an idiosyncrasy in the version of Hydra they were running: its SDK issued SELECT * queries, which triggered deserialization errors in the face of the schema changes.

To shield users from any impact, the team rewrote the SQL migrations to use features such as CREATE INDEX CONCURRENTLY, which builds an index without blocking writes to the table, and they produced a custom build of Hydra that selected explicit columns instead of using SELECT *.
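The SELECT * problem can be reproduced in miniature. The sketch below uses an in-memory SQLite database as a stand-in for the real PostgreSQL database, with a hypothetical `clients` table and `Client` record that are not Hydra's actual schema. It shows how code that maps SELECT * rows onto a fixed structure breaks once a migration adds a column, while an explicit column list keeps working.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Client:
    # Mirrors the columns the OLD application version knows about.
    id: str
    name: str


def load_star(conn):
    # Fragile: the row shape follows whatever the table looks like today.
    return [Client(*row) for row in conn.execute("SELECT * FROM clients")]


def load_explicit(conn):
    # Robust: the row shape is fixed by the query, not by the schema.
    return [Client(*row) for row in conn.execute("SELECT id, name FROM clients")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO clients VALUES ('c1', 'demo-app')")

before = load_star(conn)  # works against the old schema

# A migration adds a column while the old code is still running.
conn.execute("ALTER TABLE clients ADD COLUMN metadata TEXT")

try:
    load_star(conn)
    star_broke = False
except TypeError:
    # Client(*row) now receives three values for two fields.
    star_broke = True

after = load_explicit(conn)  # unaffected by the new column
```

Pinning the column list lets old and new application versions share a database while the schema is mid-migration.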

With the latest 1.X upgrade mapped out, the next step was to devise a strategy for the considerably larger 2.X migration. They identified three possible approaches and evaluated the trade-offs of each. An in-place upgrade was ruled out immediately, given the sheer volume of schema changes that the major version bump introduced. A blue-green deployment strategy seemed viable, but it required far more than simply toggling a switch to point traffic at the new version. The upgrade and migration process would span multiple hours, and the system had to continue operating correctly throughout that entire window.

The first blue-green approach they considered would involve halting all writes to the database, preventing any new authorizations from being created. This would ensure nothing was lost during the transition, but it also meant that no one could use existing OAuth applications unless they already held a valid credential. It introduced yet another serious problem: if a user needed to revoke an application’s access for any reason, they would be unable to do so while the upgrade was underway.

To address these shortcomings, the team devised a method that left database writes enabled, accepting that some writes might be lost during the cutover to the green version. The first challenge was minimizing the volume of new-token writes. They found an effective lever: extending the expiration time of tokens to several hours. This meant that applications that obtained new tokens before the upgrade could continue using them without needing to refresh.
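The reasoning behind the longer expiration can be checked with simple date arithmetic. In this sketch the window length, issue time, and TTL values are illustrative assumptions, not figures from the actual rollout: a token issued shortly before the window only survives it if its lifetime covers the whole window.

```python
from datetime import datetime, timedelta, timezone


def valid_through_window(issued_at, ttl, window_start, window_end):
    """True if a token issued before the window stays valid until it ends."""
    expires_at = issued_at + ttl
    return issued_at <= window_start and expires_at >= window_end


# Illustrative timestamps: a three-hour window and a token issued 10 minutes before it.
window_start = datetime(2026, 1, 1, 2, 0, tzinfo=timezone.utc)
window_end = window_start + timedelta(hours=3)
issued_at = window_start - timedelta(minutes=10)

# A one-hour token expires mid-window and would force a refresh (a write).
short_ttl_ok = valid_through_window(issued_at, timedelta(hours=1), window_start, window_end)

# A six-hour token outlives the window, so no refresh is needed during it.
long_ttl_ok = valid_through_window(issued_at, timedelta(hours=6), window_start, window_end)
```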

With write reduction handled, the next problem was ensuring that no revocations performed by users during the upgrade window would be lost. Their solution was a queue system built on Cloudflare Queues. Whenever a revocation event occurred, a record describing that event was written into the queue. After the database was switched over to the green version, the team could drain the queue and replay every revocation that had taken place during the period when writes would otherwise have been lost. Getting this right was critical—any mistake would inadvertently restore access to applications that users had deliberately revoked.
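A minimal sketch of the capture-and-replay idea follows. It uses an in-process deque in place of Cloudflare Queues and a toy in-memory store in place of the OAuth database; the event shape and all names are hypothetical. The key property is that replay is idempotent, so replaying an event twice, or replaying one the target already reflects, does no harm.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RevocationEvent:
    # Hypothetical event shape; the real queue message format is not described.
    user_id: str
    client_id: str


@dataclass
class GrantStore:
    """Toy stand-in for an OAuth database: a set of active (user, client) grants."""
    grants: set = field(default_factory=set)

    def revoke(self, user_id: str, client_id: str) -> None:
        # discard() is a no-op if the grant is already gone, so replay is idempotent.
        self.grants.discard((user_id, client_id))


def revoke_with_capture(store, queue, user_id, client_id):
    """Apply the revocation to the live store and record it for later replay."""
    store.revoke(user_id, client_id)
    queue.append(RevocationEvent(user_id, client_id))


def replay(queue, target):
    """Drain captured events into the target store; return how many were replayed."""
    replayed = 0
    while queue:
        event = queue.popleft()
        target.revoke(event.user_id, event.client_id)
        replayed += 1
    return replayed


# Blue is live; green starts as a snapshot taken before the migration began.
blue = GrantStore({("alice", "app-1"), ("bob", "app-2")})
green = GrantStore(set(blue.grants))
queue = deque()

# During the upgrade window, alice revokes app-1. Only blue sees the write.
revoke_with_capture(blue, queue, "alice", "app-1")

# Without replay, green would still hold the revoked grant after cutover.
stale_before_replay = ("alice", "app-1") in green.grants
count = replay(queue, green)
```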

From an operational standpoint, the first upgrade to the latest 1.X release went smoothly with zero issues. The custom database migrations completed faster than anticipated, and there was no impact on users. The team had to perform a hard cutover to the new version because the legacy version was unable to introspect tokens issued by the newer release.

Following the cutover, however, we observed a spike in refresh token errors of a type we had never seen before. The root cause turned out to be stricter refresh-token invalidation logic in the newer version: whenever a refresh token was reused, Hydra would invalidate the entire chain of access and refresh tokens. This created a serious problem for Wrangler and MCP clients, both of which generate a high volume of requests: a single reused refresh token would wipe out the whole session.

We addressed this by adding refresh-token coalescing to the Worker that routes OAuth traffic to the proper destination. This let us briefly cache each refresh-token request before it reached Hydra, so that a detected retry could be answered from the cache without invalidating any tokens. As a longer-term fix, Hydra’s 2.X releases include a configurable “refresh token grace period,” which allows a refresh token to be retried for a short window without tearing down the entire chain.
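The coalescing idea can be sketched as a small single-flight cache. The version below is written in Python with asyncio rather than as a Worker, and the class name, TTL value, and upstream callable are assumptions made for illustration. Duplicate requests carrying the same refresh token within a short window share one upstream call, so the token is only ever presented to the OAuth engine once.

```python
import asyncio
import time


class RefreshCoalescer:
    """Collapse duplicate refresh requests for the same token into one upstream call."""

    def __init__(self, upstream, ttl_seconds: float = 5.0):
        self._upstream = upstream  # async callable: refresh_token -> response
        self._ttl = ttl_seconds    # how long a response is reused (assumed value)
        self._entries = {}         # refresh_token -> (created_at, Task)

    async def refresh(self, refresh_token: str):
        now = time.monotonic()
        entry = self._entries.get(refresh_token)
        if entry is not None and now - entry[0] < self._ttl:
            # A request for this token is in flight or recently finished:
            # reuse its result instead of presenting the token upstream again.
            return await entry[1]
        task = asyncio.create_task(self._upstream(refresh_token))
        self._entries[refresh_token] = (now, task)
        try:
            return await task
        except Exception:
            # Do not cache failures; let the next attempt go upstream.
            self._entries.pop(refresh_token, None)
            raise


async def demo():
    calls = []

    async def fake_upstream(token):
        # Stand-in for the OAuth engine's token endpoint.
        calls.append(token)
        await asyncio.sleep(0.01)
        return {"access_token": f"new-for-{token}"}

    coalescer = RefreshCoalescer(fake_upstream)
    # Simulate a client sending the same refresh request twice at once.
    first, retry = await asyncio.gather(
        coalescer.refresh("rt-1"), coalescer.refresh("rt-1")
    )
    return calls, first, retry


calls, first, retry = asyncio.run(demo())
```

This sketch never evicts old entries and keys the cache on the raw token; a production version would bound the cache and likely key on a hash of the token instead.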

For the 2.X upgrade, several hours of noticeable user impact would not be acceptable, so we relied on our blue-green upgrade strategy. In principle, this approach is straightforward: run the migrations against a copy of the production database, and then switch over to the new Hydra version once they finish. In practice, however, there were many more pieces that had to be coordinated:

Enable the revocation replay capture queue
Copy and restore our database to the new target environment
Perform targeted data cleanup — existing rows violated certain new constraints introduced in the newer versions, which could cause the migrations to fail
Execute simultaneous cutovers across the Hydra service and two other critical internal systems to avoid any errors
Conduct post-cutover monitoring and validation

We selected an upgrade window during which Hydra’s per-second request volume was at its lowest, so that any lost token writes would be minimal. Aside from some adjustments to timeout settings, our production migrations ran smoothly against the new database, with a total runtime of roughly three hours. Once the migrations were done, we carefully rolled out the new version of the Hydra service, along with two additional system configuration changes to point our systems at the new SDK version.

Shortly after we switched traffic over, we noticed that a data cleanup job inside our authorization service — which depends on the Hydra consent session API — was purging OAuth policy data far too aggressively. After digging into the issue, we found a bug in one of the Hydra migrations that corrupted the state of certain valid OAuth sessions, causing the migration to flag them as invalid. This corruption of valid sessions created a mismatch between Hydra and our authorization service, which showed up as a spike in 403 errors. To address this, we performed data restorations and began working on improvements to OAuth authorization behavior so that the system no longer depends on static policy data.

Beyond the data cleanup problem, there were a handful of smaller fixes driven by specific client behaviors, which we shipped quickly.

With the Hydra version upgrade finished, OAuth traffic has remained stable, delivering better performance and reliability for our customers. It also brought production onto the same foundation that our newer OAuth APIs had already been validated against in staging, paving the way for our self-managed OAuth release on June 3.

After completing an upgrade of this scale, it is always both rewarding and instructive to review broad metrics that reflect the impact. We collected additional metrics during the database migrations and saw meaningful performance gains once the upgrade was complete.

Metric	Approx. Value
Rows updated	132.5M
Rows inserted	114.7M
Temp bytes	136.97 GB
Transaction commits	22.2k

Metric (avg)	Before	After	Change
API P95	185 ms	101 ms	-45%
RSS memory	888 MB	763 MB	-14%
Go heap allocation	449 MB	271 MB	-40%
Goroutines	4,015	3,076	-23%
CPU usage	1.07 cores	0.67 cores	-37%
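The Change column is the relative difference between the two averages, rounded to a whole percent. The short check below recomputes it from the Before and After values in the table; the function name is ours.

```python
def pct_change(before: float, after: float) -> int:
    """Relative change from before to after, rounded to a whole percent."""
    return round((after - before) / before * 100)


# (before, after) pairs copied from the table above.
metrics = {
    "API P95 (ms)": (185, 101),
    "RSS memory (MB)": (888, 763),
    "Go heap allocation (MB)": (449, 271),
    "Goroutines": (4015, 3076),
    "CPU usage (cores)": (1.07, 0.67),
}

changes = {name: pct_change(b, a) for name, (b, a) in metrics.items()}
```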

Self-managed OAuth for Everyone

Extending OAuth access to all customers marks a significant milestone in building a richer Cloudflare application ecosystem. Every Cloudflare user can now develop custom OAuth apps and construct integrations directly on Cloudflare. We’re thrilled to roll out self-managed OAuth across the board.

To dive in, check out our documentation or head over to the OAuth apps section in the dashboard to craft your very first OAuth app.
