ecdysis | ˈekdəsəs |
noun
the process of shedding the old skin (in reptiles) or casting off the outer
cuticle (in insects and other arthropods).
How do you upgrade a network service, handling millions of requests per second around the globe, without disrupting even a single connection?
One of our solutions at Cloudflare to this massive challenge has long been ecdysis, a Rust library that implements graceful process restarts where no live connections are dropped, and no new connections are refused.
Last month, we open-sourced ecdysis, so now anyone can use it. After five years of production use at Cloudflare, ecdysis has proven itself by enabling zero-downtime upgrades across our critical Rust infrastructure, saving millions of requests with every restart across Cloudflare’s global network.
It’s hard to overstate the importance of getting these upgrades right, especially at the scale of Cloudflare’s network. Many of our services perform critical tasks such as traffic routing, TLS lifecycle management, or firewall rules enforcement, and must operate continuously. If one of these services goes down, even for an instant, the cascading impact can be catastrophic. Dropped connections and failed requests quickly lead to degraded customer performance and business impact.
When these services need updates, security patches can’t wait. Bug fixes need deployment and new features must roll out.
The naive approach involves waiting for the old process to be stopped before spinning up the new one, but this creates a window of time where connections are refused and requests are dropped. For a service handling thousands of requests per second in a single location, multiply that across hundreds of data centers, and a brief restart becomes millions of failed requests globally.
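To make that scale concrete, here is a back-of-the-envelope sketch. The traffic figures are hypothetical and chosen only for illustration; they are not Cloudflare measurements, and `dropped_requests` is a helper invented for this example.

```rust
/// Requests that arrive while nothing is listening, assuming a constant
/// request rate at every location for the whole gap.
fn dropped_requests(requests_per_second: u64, gap_ms: u64, locations: u64) -> u64 {
    requests_per_second * gap_ms / 1000 * locations
}

fn main() {
    // Hypothetical figures for illustration only.
    let per_site = dropped_requests(5_000, 100, 1);
    let global = dropped_requests(5_000, 2_000, 300);
    println!("100 ms gap, one location: {per_site} dropped");
    println!("2 s gap, 300 locations: {global} dropped");
}
```

Under these assumed numbers, a 100 ms gap costs 500 requests at one location, and a two-second gap across 300 locations costs 3,000,000.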
Let’s dig into the problem, and how ecdysis has been the solution for us, and maybe will be for you.
Links: GitHub | crates.io | docs.rs
Why graceful restarts are hard
The naive approach to restarting a service, as we mentioned, is to stop the old process and start a new one. This works acceptably for simple services that don’t handle real-time requests, but for network services processing live connections, this approach has critical limitations.
First, the naive approach creates a window during which no process is listening for incoming connections. When the old process stops, it closes its listening sockets, which causes the OS to immediately refuse new connections with ECONNREFUSED. Even if the new process starts immediately, there will always be a gap where nothing is accepting connections, whether milliseconds or seconds. For a service handling thousands of requests per second, even a gap of 100ms means hundreds of dropped connections.
Second, stopping the old process kills all already-established connections. A client uploading a large file or streaming video gets abruptly disconnected. Long-lived connections like WebSockets or gRPC streams are terminated mid-operation. From the client’s perspective, the service simply vanishes.
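The first limitation is easy to reproduce with nothing but the Rust standard library. The sketch below is not taken from ecdysis: it binds a listener on an ephemeral loopback port, closes it to simulate the old process stopping, and reports the error a client sees while no new process has taken over. The function name `connect_during_gap` is invented for this example.

```rust
use std::io::ErrorKind;
use std::net::{TcpListener, TcpStream};

/// Binds a listener, closes it (the "old process stops"), and returns the
/// error kind a client sees when it connects during the gap.
fn connect_during_gap() -> std::io::Result<Option<ErrorKind>> {
    let old = TcpListener::bind("127.0.0.1:0")?;
    let addr = old.local_addr()?;

    // The old process exits: its listening socket is closed.
    drop(old);

    // Nothing is listening on `addr` now, so the connection attempt fails.
    Ok(TcpStream::connect(addr).err().map(|e| e.kind()))
}

fn main() {
    println!("client saw: {:?}", connect_during_gap());
}
```

On a typical system the client sees `ConnectionRefused`, the Rust mapping of ECONNREFUSED.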
Binding the new process before shutting down the old one appears to solve this, but also introduces additional issues. The kernel normally allows only one process to bind to an address:port combination, but the SO_REUSEPORT socket option permits multiple binds. However, this creates a problem during process transitions that makes it unsuitable for graceful restarts.
When SO_REUSEPORT is used, the kernel creates separate listening sockets for each process and load balances new connections across these sockets. When the initial SYN packet for a connection is received, the kernel will assign it to one of the listening processes. Once the initial handshake is completed, the connection then sits in the accept() queue of the process until the process accepts it. If the process then exits before accepting this connection, it becomes orphaned and is terminated by the kernel. GitHub’s engineering team documented this issue extensively when building their GLB Director load balancer.
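The orphaning effect can be observed on a single socket, without SO_REUSEPORT. The sketch below uses only the Rust standard library and is an illustration, not ecdysis code: a client completes the handshake, the listener is closed before accept() is ever called, and the client then finds its connection unusable. The function name and the two-second read timeout are choices made for this example.

```rust
use std::io::{ErrorKind, Read};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Returns true if a connection that completed its handshake but was never
/// accepted becomes unusable once the listening socket is closed.
fn unaccepted_connection_is_orphaned() -> std::io::Result<bool> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // The kernel completes the handshake; the connection waits in the accept queue.
    let mut client = TcpStream::connect(addr)?;
    client.set_read_timeout(Some(Duration::from_secs(2)))?;

    // The listening process "exits" without ever calling accept().
    drop(listener);

    // The client should now see EOF or an error such as a connection reset.
    // A read timeout means the connection was not torn down, so report false.
    let mut buf = [0u8; 1];
    match client.read(&mut buf) {
        Ok(0) => Ok(true),
        Ok(_) => Ok(false),
        Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => Ok(false),
        Err(_) => Ok(true),
    }
}

fn main() {
    println!("orphaned: {:?}", unaccepted_connection_is_orphaned());
}
```

With SO_REUSEPORT the same thing happens to whatever connections the kernel had queued on the exiting process's socket, even though another process is still listening on the same port.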
When we set out to design and build ecdysis, we identified four key goals for the library:
Old code can be completely shut down post-upgrade.
The new process has a grace period for initialization.
New code crashing during initialization is acceptable and shouldn’t affect the running service.
Only a single upgrade runs in parallel to avoid cascading failures.
ecdysis satisfies these requirements following an approach pioneered by NGINX, which has supported graceful upgrades since its early days. The approach is straightforward:
The parent process fork()s a new child process.
The child process replaces itself with a new version of the code with execve().
The child process inherits the socket file descriptors via a named pipe shared with the parent.
The parent process waits for the child process to signal readiness before shutting down.
Crucially, the socket remains open throughout the transition. The child process inherits the listening socket from the parent as a file descriptor shared via a named pipe. During the child’s initialization, both processes share the same underlying kernel data structure, allowing the parent to continue accepting and processing new and existing connections. Once the child completes initialization, it notifies the parent and begins accepting connections. Upon receiving this ready notification, the parent immediately closes its copy of the listening socket and continues handling only existing connections.
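The property this relies on, that a listening socket stays open as long as any descriptor still refers to it, can be shown inside a single process. The sketch below is an analogy and not how ecdysis passes sockets: it uses `TcpListener::try_clone` to create a second descriptor for the same kernel socket, standing in for the copy the child inherits. The function name is invented for this example.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

/// Returns true if a client can still connect and exchange a byte after the
/// "parent" handle to the listening socket has been closed.
fn handoff_keeps_socket_open() -> std::io::Result<bool> {
    // The "parent" binds the listening socket on an ephemeral loopback port.
    let parent = TcpListener::bind("127.0.0.1:0")?;
    let addr = parent.local_addr()?;

    // Stand-in for inheritance: a second descriptor for the same kernel socket.
    let child = parent.try_clone()?;

    // The parent closes its copy once the "child" is ready.
    drop(parent);

    // New connections are still accepted through the remaining descriptor.
    let mut client = TcpStream::connect(addr)?;
    let (mut server_side, _) = child.accept()?;
    client.write_all(b"x")?;
    let mut buf = [0u8; 1];
    server_side.read_exact(&mut buf)?;
    Ok(&buf == b"x")
}

fn main() {
    println!("still accepting after parent close: {:?}", handoff_keeps_socket_open());
}
```

Because the socket is never closed and rebound, there is no moment at which a connecting client would be refused.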
This process eliminates coverage gaps while providing the child a safe initialization window. There is a brief window of time when both the parent and child may accept connections concurrently. This is intentional; any connections accepted by the parent are simply handled until completion as part of the draining process.
This model also provides the required crash safety. If the child process fails during initialization (e.g., due to a configuration error), it simply exits. Since the parent never stopped listening, no connections are dropped, and the upgrade can be retried once the problem is fixed.
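The parent's side of that decision can be sketched with the standard library alone. This is an illustration under stated assumptions, not the ecdysis implementation: the child is a stand-in `sh -c` script, readiness is a line reading `ready` on the child's stdout rather than a named pipe, and `Upgrade` and `try_upgrade` are names invented for this example.

```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

#[derive(Debug, PartialEq)]
enum Upgrade {
    /// The child signaled readiness: the parent may stop listening and drain.
    ChildReady,
    /// The child exited first: the parent keeps serving and can retry later.
    ChildFailed,
}

/// Spawns `sh -c <script>` as a stand-in child and waits for it to either
/// print "ready" or close its stdout without doing so.
fn try_upgrade(script: &str) -> std::io::Result<Upgrade> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(script)
        .stdout(Stdio::piped())
        .spawn()?;
    let stdout = child.stdout.take().expect("stdout was piped");

    // Block until the child writes a line or exits (EOF reads zero bytes).
    let mut line = String::new();
    let n = BufReader::new(stdout).read_line(&mut line)?;
    let outcome = if n > 0 && line.trim() == "ready" {
        Upgrade::ChildReady
    } else {
        Upgrade::ChildFailed
    };

    // Reap the stand-in child; a real upgraded child would keep running.
    child.wait()?;
    Ok(outcome)
}

fn main() {
    println!("{:?}", try_upgrade("echo ready"));
    println!("{:?}", try_upgrade("exit 1"));
}
```

The parent changes nothing about its own listening socket until it has seen the readiness signal, so a child that crashes during startup leaves the running service untouched.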
ecdysis implements the forking model with first-class support for asynchronous programming through Tokio and systemd integration:
Tokio integration: Native async stream wrappers for Tokio. Inherited sockets become listeners without additional glue code. For synchronous services, ecdysis supports operation without async runtime requirements.
systemd-notify support: When the systemd_notify feature is enabled, ecdysis automatically integrates with systemd’s process lifecycle notifications. Setting Type=notify-reload in your service unit file allows systemd to track upgrades correctly.
systemd named sockets: The systemd_sockets feature enables ecdysis to manage systemd-activated sockets. Your service can be socket-activated and support graceful restarts simultaneously.
Platform note: ecdysis relies on Unix-specific syscalls for socket inheritance and process management. It does not work on Windows. This is a fundamental limitation of the forking approach.
Graceful restarts introduce security considerations. The forking model creates a brief window where two process generations coexist, both with access to the same listening sockets and potentially sensitive file descriptors.
ecdysis addresses these concerns through its design:
Fork-then-exec: ecdysis follows the traditional Unix pattern of fork() followed immediately by execve(). This ensures the child process starts with a clean slate: new address space, fresh code, and no inherited memory. Only explicitly-passed file descriptors cross the boundary.
Explicit inheritance: Only listening sockets and communication pipes are inherited. Other file descriptors are closed via CLOEXEC flags. This prevents accidental leakage of sensitive handles.
seccomp compatibility: Services using seccomp filters must allow fork() and execve(). This is a tradeoff: graceful restarts require these syscalls, so they cannot be blocked.
For most network services, these tradeoffs are acceptable. The security of the fork-exec model is well understood and has been battle-tested for decades in software like NGINX and Apache.
Let’s look at a practical example. Here’s a simplified TCP echo server that supports graceful restarts:
    use ecdysis::tokio_ecdysis::{SignalKind, StopOnShutdown, TokioEcdysisBuilder};
    use tokio::{net::TcpStream, task::JoinSet};
    use futures::StreamExt;
    use std::net::SocketAddr;

    #[tokio::main]
    async fn main() {
        // Create the ecdysis builder
        let mut ecdysis_builder = TokioEcdysisBuilder::new(
            SignalKind::hangup() // Trigger upgrade/reload on SIGHUP
        ).unwrap();

        // Trigger stop on SIGUSR1
        ecdysis_builder
            .stop_on_signal(SignalKind::user_defined1())
            .unwrap();

        // Create listening socket - will be inherited by children
        let addr: SocketAddr = "0.0.0.0:8080".parse().unwrap();
        let stream = ecdysis_builder
            .build_listen_tcp(StopOnShutdown::Yes, addr, |builder, addr| {
                builder.set_reuse_address(true)?;
                builder.bind(&addr.into())?;
                builder.listen(128)?;
                Ok(builder.into())
            })
            .unwrap();

        // Spawn task to handle connections
        let server_handle = tokio::spawn(async move {
            let mut stream = stream;
            let mut set = JoinSet::new();
            while let Some(Ok(socket)) = stream.next().await {
                set.spawn(handle_connection(socket));
            }
            set.join_all().await;
        });

        // Signal readiness and wait for shutdown
        let (_ecdysis, shutdown_fut) = ecdysis_builder.ready().unwrap();
        let shutdown_reason = shutdown_fut.await;
        log::info!("Shutting down: {:?}", shutdown_reason);

        // Gracefully drain connections
        server_handle.await.unwrap();
    }

    async fn handle_connection(mut socket: TcpStream) {
        // Echo connection logic here
    }

The key points:
build_listen_tcp creates a listener that will be inherited by child processes.
ready() signals to the parent process that initialization is complete and that it can safely exit.
shutdown_fut.await blocks until an upgrade or stop is requested. This future only yields once the process should be shut down, either because an upgrade/reload was executed successfully or because a shutdown signal was received.
When you send SIGHUP to this process, here’s what ecdysis does…
…on the parent process:
Forks and execs a new instance of your binary.
Passes the listening socket to the child.
Waits for the child to call ready().
Drains existing connections, then exits.
…on the child process:
Initializes itself following the same execution flow as the parent, except any sockets owned by ecdysis are inherited and not bound by the child.
Signals readiness to the parent by calling ready().
Blocks waiting for a shutdown or upgrade signal.
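The parent's final step, draining, works because closing a listening socket does not affect connections that were already accepted. The sketch below illustrates this with blocking standard-library sockets and a thread instead of Tokio tasks, so it is not the ecdysis code path. It also fills in a simple echo handler as one possible body for the `handle_connection` stub in the example above; `drain_demo` is a name invented here.

```rust
use std::io::{Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Echoes bytes back to the peer until the peer closes its write side.
fn handle_connection(mut socket: TcpStream) -> std::io::Result<u64> {
    let mut reader = socket.try_clone()?;
    std::io::copy(&mut reader, &mut socket)
}

/// Returns the bytes echoed on a connection that was established before
/// shutdown, and whether a new connection attempt after shutdown failed.
fn drain_demo() -> std::io::Result<(Vec<u8>, bool)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // One connection is established and handed to a worker before shutdown.
    let mut client = TcpStream::connect(addr)?;
    let (socket, _) = listener.accept()?;
    let worker = thread::spawn(move || handle_connection(socket));

    // The upgrade is complete: stop accepting by closing the listening socket.
    drop(listener);
    let new_connection_failed = TcpStream::connect(addr).is_err();

    // The established connection keeps working while it drains.
    client.write_all(b"ping")?;
    client.shutdown(Shutdown::Write)?;
    let mut echoed = Vec::new();
    client.read_to_end(&mut echoed)?;

    // Exit only after in-flight work has finished.
    worker.join().expect("worker panicked")?;
    Ok((echoed, new_connection_failed))
}

fn main() {
    println!("{:?}", drain_demo());
}
```

In a real upgrade the new connection attempt would not fail, because the child holds its own descriptor for the same listening socket; here no child exists, which makes the parent's stop-accepting step visible.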
ecdysis has been running in production at Cloudflare since 2021. It powers critical Rust infrastructure services deployed across 330+ data centers in 120+ countries. These services handle billions of requests per day and require frequent updates for security patches, feature releases, and configuration changes.
Every restart using ecdysis saves hundreds of thousands of requests that would otherwise be dropped during a naive stop/start cycle. Across our global footprint, this translates to millions of preserved connections and improved reliability for customers.
Graceful restart libraries exist for several ecosystems. Understanding when to use ecdysis versus alternatives is critical to choosing the right tool.
tableflip is our Go library that inspired ecdysis. It implements the same fork-and-inherit model for Go services. If you need Go, tableflip is a great option!
shellflip is Cloudflare’s other Rust graceful restart library, designed specifically for Oxy, our Rust-based proxy. shellflip is more opinionated: it assumes systemd and Tokio, and focuses on transferring arbitrary application state between parent and child. This makes it excellent for complex stateful services, or services that want to apply such aggressive sandboxing that they can’t even open their own sockets, but adds overhead for simpler cases.
ecdysis brings five years of production-hardened graceful restart capabilities to the Rust ecosystem. It’s the same technology protecting millions of connections across Cloudflare’s global network, now open-sourced and available for anyone!
Full documentation is available at docs.rs/ecdysis, including API reference, examples for common use cases, and steps for integrating with systemd.
The examples directory in the repository contains working code demonstrating TCP listeners, Unix socket listeners, and systemd integration.
The library is actively maintained by the Argo Smart Routing & Orpheus team, with contributions from teams across Cloudflare. We welcome contributions, bug reports, and feature requests on GitHub.
Whether you’re building a high-performance proxy, a long-lived API server, or any network service where uptime matters, ecdysis can provide a foundation for zero-downtime operations.
Start building: github.com/cloudflare/ecdysis



