A few months ago, Cloudflare announced the transition to FL2, our Rust-based rewrite of Cloudflare's core request handling layer. This transition accelerates our ability to help build a better Internet for everyone. Alongside the software stack migration, Cloudflare has refreshed our server hardware design with improved hardware capabilities and better efficiency to serve the evolving demands of our network and software stack. Gen 13 is designed with a 192-core AMD EPYC™ Turin 9965 processor, 768 GB of DDR5-6400 memory, 24 TB of PCIe 5.0 NVMe storage, and a dual 100 GbE port network interface card.
Gen 13 delivers:
- Up to 2x throughput compared to Gen 12 while staying within latency SLA
- Up to 50% improvement in performance-per-watt efficiency, reducing data center expansion costs
- Up to 60% higher throughput per rack while keeping the rack power budget constant
- 2x memory capacity, 1.5x storage capacity, 4x network bandwidth
- Introduced PCIe encryption hardware support in addition to memory encryption
- Improved support for thermally demanding, powerful drop-in PCIe accelerators

This blog post covers the engineering rationale behind each major component decision: what we evaluated, what we chose, and why.
| Generation | Gen 13 Compute | Previous Gen 12 Compute |
|---|---|---|
| Form Factor | 2U1N, single socket | 2U1N, single socket |
| Processor | AMD EPYC™ 9965 | AMD EPYC™ 9684X |
| Memory | 768 GB DDR5-6400, 12 memory channels | 384 GB DDR5-4800, 12 memory channels |
| Storage | 3x E1.S NVMe, Samsung PM9D3a 7.68 TB / Micron 7600 Pro 7.68 TB | 2x E1.S NVMe, Samsung PM9A3 7.68 TB / Micron 7450 Pro 7.68 TB |
| Network | Dual 100 GbE OCP 3.0, Intel Ethernet Network Adapter E830-CDA2 / NVIDIA Mellanox ConnectX-6 Dx | Dual 25 GbE OCP 3.0, Intel Ethernet Network Adapter E810-XXVDA2 / NVIDIA Mellanox ConnectX-6 Lx |
| System Management | DC-SCM 2.0, ASPEED AST2600 (BMC) + AST1060 (HRoT) | DC-SCM 2.0, ASPEED AST2600 (BMC) + AST1060 (HRoT) |
| Power Supply | 1300W, Titanium grade | 800W, Titanium grade |

Figure: Gen 13 server
Gen 12 | AMD EPYC™ 9684X Genoa-X 96-core (400W TDP, 1152 MB L3 cache)
Gen 13 | AMD EPYC™ 9965 Turin Dense 192-core (500W TDP, 384 MB L3 cache)

During the design phase, we evaluated several 5th generation AMD EPYC™ processors, code-named Turin, in Cloudflare's hardware lab: the AMD Turin 9755, AMD Turin 9845, and AMD Turin 9965. The table below summarizes how the specifications of the Gen 13 candidates compare against the AMD Genoa-X 9684X used in our Gen 12 servers. Notably, all three candidates offer increases in core count but with smaller L3 cache per core. However, with the migration to FL2, the new workloads are less dependent on L3 cache and scale up well with the increased core count, achieving up to a 100% increase in throughput.

The three CPU candidates target different use cases: the AMD Turin 9755 offers superior per-core performance, the AMD Turin 9965 trades per-core performance for efficiency, and the AMD Turin 9845 trades core count for lower socket power. We evaluated all three CPUs in the production environment.
| CPU Model | AMD Genoa-X 9684X | AMD Turin 9755 | AMD Turin 9845 | AMD Turin 9965 |
|---|---|---|---|---|
| For server platform | Gen 12 | Gen 13 candidate | Gen 13 candidate | Gen 13 candidate |
| # of CPU cores | 96 | 128 | 160 | 192 |
| # of threads | 192 | 256 | 320 | 384 |
| Base clock | 2.4 GHz | 2.7 GHz | 2.1 GHz | 2.25 GHz |
| Max boost clock | 3.7 GHz | 4.1 GHz | 3.7 GHz | 3.7 GHz |
| All-core boost clock | 3.42 GHz | 4.1 GHz | 3.25 GHz | 3.35 GHz |
| Total L3 cache | 1152 MB | 512 MB | 320 MB | 384 MB |
| L3 cache per core | 12 MB/core | 4 MB/core | 2 MB/core | 2 MB/core |
| Maximum configurable TDP | 400W | 500W | 390W | 500W |
First, FL2 ended the L3 cache crunch.
L3 cache is the large, last-level cache shared among all CPU cores on the same compute die to store frequently used data. It bridges the gap between slow main memory external to the CPU and the fast but smaller L1 and L2 caches on the CPU, reducing the latency for the CPU to access data.
Some may notice that the 9965 has only 2 MB of L3 cache per core, an 83.3% reduction from the 12 MB per core on Gen 12's Genoa-X 9684X. Why trade away the very cache advantage that gave Gen 12 its edge? The answer lies in how our workloads have evolved.
Cloudflare has migrated from FL1 to FL2, a complete rewrite of our request handling layer in Rust. With the new software stack, Cloudflare's request processing pipeline has become significantly less dependent on large L3 cache. FL2 workloads scale nearly linearly with core count, and the 9965's 192 cores provide a 2x increase in hardware threads over Gen 12.
Second, performance per total cost of ownership (TCO). During production evaluation, the 9965's 192 cores delivered the highest aggregate requests per second of the three candidates, and its performance-per-watt scaled favorably at 500W TDP, yielding superior rack-level TCO.
| | Gen 12 | Gen 13 |
|---|---|---|
| Processor | AMD EPYC™ 4th Gen Genoa-X 9684X | AMD EPYC™ 5th Gen Turin 9965 |
| Core count | 96C/192T | 192C/384T |
| FL throughput | Baseline | Up to +100% |
| Performance per watt | Baseline | Up to +50% |

Third, operational simplicity. Our operational teams have a strong preference for fewer, higher-density servers. Managing a fleet of 192-core machines means fewer nodes to provision, patch, and monitor per unit of compute delivered. This directly reduces operational overhead across our global network.

Finally, they're forward compatible. The AMD processor architecture supports DDR5-6400, PCIe Gen 5.0, and CXL 2.0 Type 3 memory across all SKUs. The AMD Turin 9965 has the highest number of high-performing cores per socket in the industry, maximizing compute density per socket and keeping the platform competitive and relevant for years to come. By moving from the AMD Genoa-X 9684X to the AMD Turin 9965, we get longer security support from AMD, extending the useful lifetime of the Gen 13 servers before they become obsolete and need to be refreshed.
Gen 12 | 12x 32GB DDR5-4800 2Rx8 (384 GB total, 4 GB/core)
Gen 13 | 12x 64GB DDR5-6400 2Rx4 (768 GB total, 4 GB/core)

Because the AMD Turin processor has twice the core count of the previous generation, it demands more memory resources, both in capacity and in bandwidth, to deliver its throughput gains.
Maximizing bandwidth with 12 channels
The chosen AMD EPYC™ 9965 CPU supports twelve memory channels, and for Gen 13, we're populating every single one of them. We've chosen 64 GB DDR5-6400 ECC RDIMMs in a "one DIMM per channel" (1DPC) configuration.
This setup provides 614 GB/s of peak memory bandwidth per socket, a 33.3% increase compared to our Gen 12 server platform. By using all 12 channels, we ensure that the CPU isn't "starved" for data, even during the most memory-intensive parallel workloads.
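For those who want to check the math, the peak figure follows directly from DDR5 arithmetic; this short sketch (assuming the standard 8-byte data bus per channel) reproduces both generations' numbers:

```python
# Peak DDR5 bandwidth: channels x transfer rate (MT/s) x 8 bytes per transfer.
def peak_bw_gbs(channels: int, mts: int) -> float:
    return channels * mts * 8 / 1000  # GB/s

gen12 = peak_bw_gbs(12, 4800)  # 460.8 GB/s
gen13 = peak_bw_gbs(12, 6400)  # 614.4 GB/s
print(f"Gen 12: {gen12:.1f} GB/s, Gen 13: {gen13:.1f} GB/s")
print(f"Increase: {(gen13 / gen12 - 1) * 100:.1f}%")  # 33.3%
```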
Populating all twelve channels in a balanced configuration (equal capacity per channel, with no mixed configurations) is common best practice. This matters operationally: AMD Turin processors interleave across all memory channels that have the same DIMM type, same memory capacity, and same rank configuration. Interleaving increases memory bandwidth by spreading contiguous memory accesses across all memory channels in the interleave set, instead of sending all memory accesses to a single channel or a small subset of channels.
The 4 GB per core “sweet spot”
Our Gen 12 servers are configured with 4 GB per core. We revisited that decision as we designed Gen 13.
Cloudflare launches a variety of new products and services every month, and each new product or service demands an incremental amount of memory capacity. These demands accumulate over time and can become a source of memory pressure if capacity is not sized appropriately.
The initial requirement considered a memory-to-core ratio between 4 GB and 6 GB per core. With 192 cores on the AMD Turin 9965, that translates to a range of 768 GB to 1152 GB. Note that at higher capacities, DIMM module capacities typically come in 16 GB increments. With 12 channels in a 1DPC configuration, our options are 12x 48GB (576 GB), 12x 64GB (768 GB), or 12x 96GB (1152 GB).
- 12x 48GB = 576 GB, or 1.5 GB/thread. This configuration's memory capacity is too low; it would starve memory-hungry workloads and violate the lower bound.
- 12x 96GB = 1152 GB, or 3.0 GB/thread. This would be a 50% capacity increase per core, and would also mean higher power consumption and a substantial increase in cost, especially in current market conditions where memory prices are 10x what they were a year ago.
- 12x 64GB = 768 GB, or 2.0 GB/thread (4 GB/core). This configuration is consistent with our Gen 12 memory-to-core ratio and represents a 2x increase in memory capacity per server. Keeping the configuration at 4 GB per core provides sufficient capacity for workloads that scale with core count, like our primary workload, FL, and leaves sufficient memory capacity headroom for future growth without overprovisioning.
FL2 uses memory more efficiently than FL1 did: our internal measurements show FL2 uses less than half the CPU of FL1, and far less than half the memory. The capacity freed up by the software stack migration provides ample headroom to support Cloudflare's growth for the next few years.
The decision: 12x 64GB for 768 GB total. This maintains the proven 4 GB/core ratio, provides a 2x total capacity increase over Gen 12, and stays within the sweet spot of the DIMM price curve.
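The three candidate configurations reduce to simple arithmetic; a small sketch using the core and thread counts from the CPU table above:

```python
# Candidate 1DPC configurations across 12 channels, for 192 cores / 384 threads.
CORES, THREADS, CHANNELS = 192, 384, 12

for dimm_gb in (48, 64, 96):
    total = CHANNELS * dimm_gb
    print(f"12x {dimm_gb}GB = {total} GB "
          f"({total / CORES:.1f} GB/core, {total / THREADS:.1f} GB/thread)")
# 12x 48GB = 576 GB  (3.0 GB/core, 1.5 GB/thread) -> below the 4 GB/core floor
# 12x 64GB = 768 GB  (4.0 GB/core, 2.0 GB/thread) -> chosen
# 12x 96GB = 1152 GB (6.0 GB/core, 3.0 GB/thread) -> cost- and power-heavy
```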
Efficiency through dual rank
In Gen 12, we demonstrated that dual-rank DIMMs provide measurably higher memory throughput than single-rank modules, with an advantage of up to 17.8% at a 1:1 read-write ratio. Dual-rank DIMMs are faster because they allow the memory controller to access one rank while another is refreshing. That same principle carries forward here.
Our requirement also calls for about 1 GB/s of memory bandwidth per hardware thread. With 614 GB/s of peak bandwidth across 384 threads, we deliver 1.6 GB/s per thread, comfortably exceeding the minimum. Production analysis has shown that Cloudflare workloads aren't memory-bandwidth-bound, so we bank the headroom as margin for future workload growth.
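The per-thread figure is just the peak bandwidth divided evenly across SMT threads:

```python
peak_gbs = 614.4        # 12 channels of DDR5-6400
threads = 384           # 192 cores x 2 SMT threads
per_thread = peak_gbs / threads
print(f"{per_thread:.1f} GB/s per thread")  # 1.6 GB/s vs. the ~1 GB/s requirement
```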
By selecting 2Rx4 DDR5 RDIMMs at the maximum supported 6400 MT/s, we ensure we get the lowest latency and best performance from the Gen 13 platform's memory configuration.
Gen 12 | 2x E1.S NVMe PCIe 4.0, 16 TB total: Samsung PM9A3 7.68TB / Micron 7450 Pro 7.68TB
Gen 13 | 3x E1.S NVMe PCIe 5.0, 24 TB total: Samsung PM9D3a 7.68TB / Micron 7600 Pro 7.68TB, plus optional 10x U.2 NVMe PCIe 5.0

Our storage architecture underwent a transformation in Gen 12 when we pivoted from M.2 to EDSFF E1.S. For Gen 13, we're increasing both storage capacity and bandwidth to align with the latest technology. We have also added a front drive bay for the flexibility to add up to 10x U.2 drives, keeping pace with the growth of Cloudflare's storage products.
Gen 13 is configured with PCIe Gen 5.0 NVMe drives. While Gen 4.0 served us well, the move to Gen 5.0 ensures that our storage subsystem can serve data at improved latency and keep up with the increased storage bandwidth demand from the new processor.
Beyond the speed increase, we're physically expanding the array from two to three NVMe drives. Our Gen 12 server platform was designed with four E1.S storage drive slots, but only two were populated with 8TB drives. The Gen 13 server platform uses the same design with four E1.S storage drive slots available, but with three populated with 8TB drives. Why add a third drive? It increases our storage capacity per server from 16TB to 24TB, expanding our global storage capacity to maintain and improve CDN cache performance. It also supports growth projections for Durable Objects, Containers, and Quicksilver services.
Front drive bay to support additional drives
For Gen 13, the chassis is designed with a front drive bay that can support up to ten U.2 PCIe Gen 5.0 NVMe drives. The front drive bay gives Cloudflare the option to use the same chassis across compute and storage platforms, as well as the flexibility to convert a compute SKU to a storage SKU when needed.
Endurance and reliability
We design our servers for a 5-year operational life and require storage drive endurance to sustain 1 DWPD (Drive Writes Per Day) over the full server lifespan.
Both the Samsung PM9D3a and Micron 7600 Pro meet the 1 DWPD specification with hardware over-provisioning (OP) of roughly 7%. If future workload profiles demand higher endurance, we have the option to hold back additional user capacity to increase the effective OP.
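As a rough illustration of what 1 DWPD implies over the service life (the standard DWPD-to-lifetime-writes conversion, not a figure from our fleet):

```python
# Lifetime writes implied by 1 DWPD sustained over a 5-year service life.
capacity_tb = 7.68   # per-drive user capacity
dwpd = 1.0           # drive writes per day
years = 5
tbw = capacity_tb * dwpd * 365 * years
print(f"~{tbw:,.0f} TB written per drive over {years} years")  # ~14,016 TB
```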
NVMe 2.0 and OCP NVMe 2.0 compliance
Both the Samsung PM9D3a and Micron 7600 adopt the NVMe 2.0 specification (up from NVMe 1.4) and the OCP NVMe Cloud SSD Specification 2.0. Key improvements include Zoned Namespaces (ZNS) for better write amplification management, the Simple Copy command for intra-device data movement without crossing the PCIe bus, and enhanced Command and Feature Lockdown for tighter security controls. The OCP 2.0 spec also adds deeper telemetry and debug capabilities purpose-built for datacenter operations, which aligns with our emphasis on fleet-wide manageability.
The storage drives continue to use the E1.S 15mm form factor. Its high-surface-area design is essential for cooling these new Gen 5.0 controllers, which can pull upwards of 25W under sustained heavy I/O. The 2U chassis provides ample airflow over the E1.S drives as well as the U.2 drive bays, a design advantage we validated in Gen 12 when we made the decision to move from 1U to 2U.
Gen 12 | Dual 25 GbE port OCP 3.0 NIC: Intel E810-XXVDA2 / NVIDIA Mellanox ConnectX-6 Lx
Gen 13 | Dual 100 GbE port OCP 3.0 NIC: Intel E830-CDA2 / NVIDIA Mellanox ConnectX-6 Dx

For more than eight years, dual 25 GbE has been the backbone of our fleet. It has served us well since 2018, but as the CPU has improved to serve more requests and our products have scaled, we've formally hit the wall. For Gen 13, we're quadrupling our per-port bandwidth.
Network Interface Card (NIC) bandwidth must keep pace with compute performance growth. With 192 modern cores, our 25 GbE links would become a measurable bottleneck. A week of production data from our co-locations worldwide showed that, on Gen 12, P95 bandwidth per port is consistently above 50% of available bandwidth. Since throughput doubles per server on Gen 13, we risk saturating the NIC bandwidth.
Figure: on Gen 12, P95 bandwidth per port is consistently above 50% of available bandwidth
The decision to go with 100 GbE rather than 50 GbE was driven by industry economics: 50 GbE transceiver volumes remain low across the industry, making them a poor supply chain bet. Dual 100 GbE ports also give us 200 Gb/s of aggregate bandwidth per server, future-proofing against the next several years of traffic growth.
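To make the saturation risk concrete, a back-of-the-envelope projection (illustrative numbers derived from the P95 observation above):

```python
# 25 GbE runs out: P95 utilization is already above 50%, and Gen 13
# roughly doubles per-server throughput.
port_gbps = 25
p95_today = 0.50 * port_gbps    # >12.5 Gb/s observed at P95 on Gen 12
projected = p95_today * 2       # ~2x throughput -> ~25 Gb/s, i.e. saturation
print(f"Projected P95 per port: >{projected:.0f} Gb/s on a {port_gbps} Gb/s link")
# The same projected load on a 100 GbE port sits at ~25% utilization.
print(f"Utilization on 100 GbE: ~{projected / 100:.0%}")
```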
Hardware choices and compatibility
We're maintaining our dual-vendor strategy to ensure supply chain resilience, a lesson hard-learned during the pandemic when single-sourcing the Gen 11 NIC left us scrambling.
Both NICs are compliant with the OCP 3.0 SFF/TSFF form factor with the integrated pull tab, maintaining chassis commonality with Gen 12 and ensuring field technicians need no new tools or training for swaps.
The OCP 3.0 NIC slot is allocated PCIe 4.0 x16 lanes on the motherboard, providing 256 Gb/s of raw bandwidth in each direction, more than enough for dual 100 GbE (200 Gb/s aggregate) with room to spare.
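The slot budget checks out on paper; note that 256 Gb/s is the raw per-direction signaling rate, before the small 128b/130b encoding overhead of PCIe 4.0:

```python
# PCIe 4.0 raw signaling: 16 GT/s per lane, per direction.
gt_per_lane = 16
lanes = 16
raw_gbps = gt_per_lane * lanes   # 256 Gb/s per direction (before 128b/130b overhead)
nic_gbps = 2 * 100               # dual 100 GbE, aggregate
print(f"Slot: {raw_gbps} Gb/s raw per direction; NIC needs {nic_gbps} Gb/s")
```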
We're maintaining the architectural shift, introduced in Gen 12, of separating management and security-related components from the motherboard onto the Project Argus Data Center Secure Control Module 2.0.
Figure: Project Argus DC-SCM 2.0
Continuity with DC-SCM 2.0
We're carrying forward the Data Center Secure Control Module 2.0 (DC-SCM 2.0) standard. By decoupling management and security functions from the motherboard, we ensure that the "brains" of the server's security stay modular and protected.
The DC-SCM module houses our most critical components:
- Basic Input/Output System (BIOS)
- Baseboard Management Controller (BMC)
- Hardware Root of Trust (HRoT) and TPM (Infineon SLB 9672)
- Dual BMC/BIOS flash chips for redundancy
Why we're staying the course with DC-SCM 2.0
The decision to keep this architecture for Gen 13 is driven by the proven security gains we saw in the previous generation. By offloading these functions to a dedicated module, we maintain:
- Rapid recovery: Dual-image redundancy allows near-instant recovery of BIOS/UEFI and BMC firmware if accidental corruption or a malicious update is detected.
- Physical resilience: The Gen 13 chassis also moves the intrusion detection mechanism farther from the flat edge of the chassis, making physical interception harder.
- PCIe encryption: In addition to TSME (Transparent Secure Memory Encryption) for CPU-to-memory encryption, which has been enabled since our Gen 10 platforms, the AMD Turin 9965 processor in Gen 13 extends encryption to PCIe traffic, ensuring data is protected in transit across every bus in the system.
- Operational consistency: Sticking with the Gen 12 management stack means our security audits, deployment, provisioning, and standard operating procedures remain fully compatible.
Gen 12 | 800W 80 PLUS Titanium CRPS
Gen 13 | 1300W 80 PLUS Titanium CRPS

As we upgrade the compute and networking capability of the server, the power envelope of our servers has naturally expanded. Gen 13 servers are equipped with larger power supplies to deliver the power needed.
While our Gen 12 nodes operated comfortably with an 800W 80 PLUS Titanium CRPS (Common Redundant Power Supply), the Gen 13 specification requires a larger power supply. We have selected a 1300W 80 PLUS Titanium CRPS.
Power consumption of Gen 13 during typical operation has risen to 850W, a 250W increase over the 600W seen in Gen 12. The primary contributors are the 500W TDP CPU (up from 400W), the doubling of memory capacity, and the additional NVMe drive.
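The resulting PSU loading at typical draw works out comfortably below capacity (figures from this section):

```python
# Typical draw vs. PSU capacity.
typical_w = 850     # Gen 13 typical operation
psu_w = 1300        # selected 80 PLUS Titanium CRPS
print(f"Typical load: {typical_w / psu_w:.0%} of PSU capacity")  # ~65%
# Of the +250W over Gen 12's ~600W typical, +100W comes from the higher
# CPU TDP; the doubled memory capacity and third NVMe drive make up the rest.
```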
Why 1300W instead of 1000W? The current PSU ecosystem lacks viable, high-efficiency options at 1000W. To ensure supply chain reliability, we moved to the next industry-standard tier at 1300W.
EU Lot 9 is a regulation that requires servers deployed in the European Union to have power supplies whose efficiency at 10%, 20%, 50%, and 100% load meets or exceeds the thresholds specified in the regulation. These thresholds match the Titanium grade requirements of the 80 PLUS power supply certification program. We chose a Titanium grade PSU for Gen 13 to maintain full compliance with EU Lot 9, ensuring that the servers can be deployed in our European data centers and beyond.
Thermal design: 2U pays dividends again
The 2U1N form factor we adopted in Gen 12 continues to pay dividends. Gen 13 uses 5x 80mm fans (up from 4x in Gen 12) to handle the increased thermal load from the 500W CPU. The larger fan volume, combined with the airflow characteristics of the 2U chassis, means the fans operate well below maximum duty cycle at typical ambient temperatures, keeping fan power under 50W per fan.
Drop-in accelerator support
Gen 12 | 2x single-width FHFL or 1x double-width FHFL
Gen 13 | 2x double-width FHFL

Maintaining the modularity of our fleet is a core requirement of our server design. This requirement enabled Cloudflare to quickly retrofit and deploy GPUs globally to more than 100 cities in 2024. In Gen 13, we're continuing support for high-performance PCIe add-in cards.
On Gen 13, the 2U chassis layout is updated and configured to support more demanding power and thermal requirements. While Gen 12 was limited to a single double-width GPU, the Gen 13 architecture supports two double-width PCIe cards.
A launchpad to scale Cloudflare to greater heights
Every generation of Cloudflare servers is an exercise in balancing competing constraints: performance versus power, capacity versus cost, flexibility versus simplicity. Gen 13 delivers 2x core count, 2x memory capacity, 4x network bandwidth, 1.5x storage capacity, and future-proofing for accelerator deployments, all while improving total cost of ownership and maintaining the robust management feature set and security posture that our global fleet demands.
Gen 13 servers are fully qualified and will be deployed to serve millions of requests across Cloudflare's global network in more than 330 cities. As always, Cloudflare's journey to serve the Internet as efficiently as possible doesn't end here. As the deployment of Gen 13 begins, we're already planning the architecture for Gen 14.
If you're excited about helping build a better Internet, come join us. We're hiring.



