Chromatix: A Differentiable, GPU-Accelerated Wave-Optics Library

Design and Implementation

Chromatix’s architecture draws significant inspiration from modern deep learning frameworks. In this section, we outline the core principles guiding our design and implementation. We believe that effective computational optics frameworks—much like their deep learning counterparts—must prioritize three essential features: differentiability, composability, and scalability. We also describe how Chromatix’s high-level implementation reflects these principles.

Differentiability

Differentiability refers to the ability to compute gradients, which are essential for gradient-based optimization, such as tuning parameters in an optical simulation. For simple, low-dimensional inputs, numerical differentiation may suffice. However, finite-difference schemes need at least one extra forward simulation per parameter, so for high-dimensional inputs such as spatial light modulator (SLM) pixel arrays this approach becomes prohibitively slow. Backpropagation, which propagates gradients efficiently backward through each simulation step, is preferred instead. Automatic differentiation is especially valuable when a simulation combines several optical models or is coupled to a neural network, because the gradient of the whole pipeline is then obtained without any manual derivation.

Languages like MATLAB and C lack built-in automatic differentiation, forcing developers to manually derive and code gradients—a tedious, error-prone, and inflexible process. Modern deep learning frameworks^24,25,26, by contrast, automatically compute gradients for any differentiable function with respect to its parameters. Chromatix leverages JAX to offer the same capability: once a simulation is defined, its gradients are automatically available.
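To make "gradients are automatically available" concrete, here is a minimal sketch in plain JAX. It does not use the Chromatix API: the forward model (`far_field_intensity`), the loss, and the target pattern are hypothetical stand-ins chosen only to show that defining a simulation is enough to obtain its gradient.

```python
import jax
import jax.numpy as jnp


def far_field_intensity(phase, amplitude):
    """Toy forward model: phase mask, then a Fourier transform, then intensity."""
    field = amplitude * jnp.exp(1j * phase)
    return jnp.abs(jnp.fft.fft2(field, norm="ortho")) ** 2


def loss(phase, amplitude, target):
    """Mean squared error between simulated and target intensity."""
    return jnp.mean((far_field_intensity(phase, amplitude) - target) ** 2)


n = 32
amplitude = jnp.ones((n, n))
# Hypothetical target: all of the input power in one off-axis pixel.
target = jnp.zeros((n, n)).at[3, 5].set(float(n * n))
# Random starting phase, one value per "SLM pixel".
phase = jax.random.uniform(jax.random.PRNGKey(0), (n, n), maxval=2 * jnp.pi)

# One call returns the loss and its gradient with respect to every pixel.
value, grads = jax.value_and_grad(loss)(phase, amplitude, target)
print(value, grads.shape)
```

The gradient array has the same shape as the phase mask, so it can be passed directly to any gradient-based optimizer. No derivative was written by hand.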

Differentiability has already enabled end-to-end design of computational optics systems across numerous applications^3,9,18,19,20,21,22,23. It also enhances solutions to inverse optical problems. Traditional methods often oversimplify both the sample and optical setup^33,34,35,36, whereas differentiable models can incorporate realistic complexity. Automatic differentiation makes gradient-based optimizers such as Adam³⁷ straightforward to apply, enabling high-fidelity reconstructions even when forward models include complex physics such as scattering^16,38 or sample deformation³⁹. Crucially, this comes at almost no extra coding effort: defining the forward model automatically defines its gradients. This also enables self-calibrating algorithms⁴⁰, where physical parameters (e.g., illumination angle in tomography) are jointly optimized with the sample to improve accuracy^41,42.

Another emerging approach replaces discrete voxel-based representations with continuous neural network-based models—known as implicit neural representations (INRs) or neural radiance fields^39,43,44,45. INRs have been used to separate motion artifacts from sample dynamics³⁹, estimate dynamic aberrations⁴⁴, reconstruct 3D quantitative phase in scattering samples⁴⁶, and correct aberrations without wavefront sensors or calibration⁴³. All of these rely on differentiable simulations for training.

Composability

Differentiability naturally supports composability: the gradient of a composite function can be derived from the gradients of its parts. More broadly, composability means being able to easily swap or replace components—such as changing an activation function in a neural network—without rewriting the entire system. This is possible thanks to standardization in machine learning, allowing researchers to rapidly integrate others’ work by simply plugging in new functions.
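The idea can be sketched in a few lines of plain JAX. The element functions below (`phase_mask`, `circular_aperture`, `square_aperture`, `fourier_lens`) and the `compose` helper are hypothetical illustrations, not Chromatix components. The point is only that swapping one part leaves the rest of the system, including the gradient computation, unchanged.

```python
import functools

import jax
import jax.numpy as jnp


def compose(*steps):
    """Chain single-argument functions, applied left to right."""
    return lambda x: functools.reduce(lambda acc, step: step(acc), steps, x)


def phase_mask(phase):
    return lambda field: field * jnp.exp(1j * phase)


def circular_aperture(radius_px, n):
    coords = jnp.arange(n) - n // 2
    xx, yy = jnp.meshgrid(coords, coords)
    mask = (xx**2 + yy**2 <= radius_px**2).astype(jnp.float32)
    return lambda field: field * mask


def square_aperture(half_width_px, n):
    coords = jnp.abs(jnp.arange(n) - n // 2)
    mask = (coords[:, None] <= half_width_px) & (coords[None, :] <= half_width_px)
    return lambda field: field * mask.astype(jnp.float32)


def fourier_lens(field):
    return jnp.fft.fft2(field, norm="ortho")


def intensity(field):
    return jnp.abs(field) ** 2


n = 16
input_field = jnp.ones((n, n), dtype=jnp.complex64)
phase = jax.random.uniform(jax.random.PRNGKey(0), (n, n))


def make_objective(aperture):
    """Build a scalar objective for a system that uses the given aperture."""
    def on_axis_intensity(phase):
        system = compose(aperture, phase_mask(phase), fourier_lens, intensity)
        return system(input_field)[0, 0]
    return on_axis_intensity


# Swapping the aperture changes one argument; the gradient code is untouched.
grad_circular = jax.grad(make_objective(circular_aperture(5, n)))(phase)
grad_square = jax.grad(make_objective(square_aperture(5, n)))(phase)
```

Pixels blocked by an aperture receive a gradient of exactly zero, so the two systems produce different gradient maps from identical optimization code.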

Optics, however, lags behind. Implementations are often custom-built for specific projects, with inconsistent conventions and no shared benchmarks for accuracy or speed. This leads to wasted effort, increased errors, and difficulty reproducing results. Chromatix addresses this by proposing a standardized framework for wave-optics simulations, enabling seamless integration of diverse optical models (Fig. 1). All experiments in this paper use a shared, rigorously tested codebase, and many additional components are documented. We believe that providing both a standard library and reference implementations can significantly accelerate research and improve reproducibility.

Fig. 1: The design and components of Chromatix.

a, Chromatix integrates wave-optics modeling, GPU acceleration, and differentiability into a single library, offering a unified framework for diverse applications. b, It supports a broad range of optical elements—including lenses, sensors, scalar and vectorial free-space propagation models^70,71,72, and complex scattering samples^16,38. c, These components can be flexibly combined to simulate various experimental setups and tackle a wide array of computational optics challenges. Green-highlighted elements indicate which component or sample is optimized in each use case. DMD: digital micromirror device; SLM: spatial light modulator; f: focal length; z_n: propagation distance.

Scalability

Optical systems are increasingly demanding larger fields of view (FOVs) and higher resolutions, often requiring large compute clusters for sample reconstruction. Thus, scalability is a critical requirement for modern optics simulation tools. Researchers need the flexibility to prototype quickly on a laptop and then seamlessly scale up to GPU clusters for large-scale reconstructions.

Previous tools have made this difficult. NumPy runs only on CPUs⁴⁷. MATLAB requires specialized code for GPU support and lacks general-purpose automatic differentiation. PyTorch²⁴ and TensorFlow²⁵ simplify GPU programming and provide automatic differentiation, but they struggle with multi-GPU support and perform poorly on optical simulation tasks that differ significantly from typical neural network operations. Writing device-specific code demands substantial effort and locks in limitations, and it falls short of the rapid iteration cycles required by modern scientific research.

Chromatix addresses this by leveraging JAX²⁶ and its built-in XLA (Accelerated Linear Algebra) compiler, enabling high-speed optical simulations across CPUs, GPUs, and tensor processing units (TPUs), all from a single codebase. There is no need to write specialized low-level GPU code to achieve performance gains, as is often necessary with tools like PyTorch²⁴. Additionally, JAX provides built-in utilities for automatically vectorizing computations (for example, processing an entire batch on one GPU) or distributing work across multiple GPUs, regardless of how the optical components in a system are described²⁶. For instance, with just a few minor code adjustments, a basic 2D, single-wavelength simulation can be expanded into a full 3D, multi-wavelength simulation that runs efficiently across several GPUs simultaneously (see Extended Data Fig. 1 for concrete examples).
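As a rough illustration of the vectorization step, the sketch below writes a single-wavelength propagation once and then maps it over several wavelengths with `jax.vmap`. The `propagate` function is a generic Fresnel transfer-function propagator written for this example (constant phase factor omitted), not Chromatix's own propagation code, and the grid size, distances, and wavelengths are arbitrary assumptions.

```python
import jax
import jax.numpy as jnp


def propagate(field, wavelength, z, dx):
    """Fresnel (paraxial) transfer-function propagation of one square 2D field."""
    n = field.shape[-1]
    fx = jnp.fft.fftfreq(n) / dx  # spatial frequencies, cycles per unit length
    fxx, fyy = jnp.meshgrid(fx, fx)
    kernel = jnp.exp(-1j * jnp.pi * wavelength * z * (fxx**2 + fyy**2))
    return jnp.fft.ifft2(jnp.fft.fft2(field) * kernel)


# Map over the wavelength argument only; jit compiles the batched function once.
propagate_spectrum = jax.jit(jax.vmap(propagate, in_axes=(None, 0, None, None)))

n, dx, z = 64, 1.0, 100.0  # lengths in micrometres (illustrative values)
coords = (jnp.arange(n) - n // 2) * dx
xx, yy = jnp.meshgrid(coords, coords)
aperture = (xx**2 + yy**2 <= 10.0**2).astype(jnp.complex64)

wavelengths = jnp.array([0.45, 0.55, 0.65])
fields = propagate_spectrum(aperture, wavelengths, z, dx)
print(fields.shape)  # (3, 64, 64)
```

The single-wavelength function is unchanged. Only the `in_axes` argument states which input carries the new batch dimension.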

Implementation

In deep learning, models are typically built as chains of operations called layers. Optics follows a similar pattern: optical systems are made up of a series of components and light propagation steps. However, there’s a crucial distinction—the “state” of an optical system isn’t abstract; it corresponds directly to the physical complex light field traveling through the setup. To fully capture this field—and thus the complete state of the system at any moment—you also need details like wavelength, polarization, and spatial sampling resolution. The foundational concept in Chromatix is that all these pieces of information can be unified within a single, core data structure. Each optical element then becomes a transformation applied to this structured field, and any full optical system is simply a sequence of such transformations. This design allows Chromatix to handle a broad range of optical setups through one consistent interface, making it easy to extend and adapt.
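The following sketch shows one way such a design could look in plain JAX. The `Field` container, the `thin_lens` and `free_space` elements, and the `run` helper are hypothetical names invented for this illustration and should not be read as Chromatix's actual classes or signatures. Both elements use standard paraxial models.

```python
from typing import Callable, NamedTuple, Sequence

import jax.numpy as jnp


class Field(NamedTuple):
    """Hypothetical container: complex amplitude plus the metadata needed to use it."""
    u: jnp.ndarray     # complex amplitude on a square grid, shape (n, n)
    dx: float          # grid spacing
    wavelength: float  # same length unit as dx


Element = Callable[[Field], Field]


def thin_lens(f: float) -> Element:
    """Paraxial thin lens: multiply by a quadratic phase."""
    def apply(field: Field) -> Field:
        n = field.u.shape[-1]
        x = (jnp.arange(n) - n // 2) * field.dx
        xx, yy = jnp.meshgrid(x, x)
        phase = -jnp.pi * (xx**2 + yy**2) / (field.wavelength * f)
        return field._replace(u=field.u * jnp.exp(1j * phase))
    return apply


def free_space(z: float) -> Element:
    """Fresnel transfer-function propagation over a distance z."""
    def apply(field: Field) -> Field:
        n = field.u.shape[-1]
        fx = jnp.fft.fftfreq(n) / field.dx
        fxx, fyy = jnp.meshgrid(fx, fx)
        kernel = jnp.exp(-1j * jnp.pi * field.wavelength * z * (fxx**2 + fyy**2))
        return field._replace(u=jnp.fft.ifft2(jnp.fft.fft2(field.u) * kernel))
    return apply


def run(elements: Sequence[Element], field: Field) -> Field:
    """An optical system is a sequence of transformations of the field."""
    for element in elements:
        field = element(field)
    return field


# Illustrative values in micrometres: a plane wave focused by a lens.
plane_wave = Field(u=jnp.ones((64, 64), dtype=jnp.complex64), dx=1.0, wavelength=0.5)
system = [thin_lens(f=200.0), free_space(z=200.0)]
focused = run(system, plane_wave)
intensity = jnp.abs(focused.u) ** 2
```

Because each element reads the sampling and wavelength from the field it receives, elements can be reordered or replaced without passing that metadata around by hand.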

Experiments

We conducted six computational experiments to highlight four key strengths of Chromatix: its ability to solve inverse problems for sample reconstruction, accelerate both reconstruction and optical design via deep learning, flexibly combine modular optical elements and models, and boost simulation speed by up to tenfold. These demonstrations include reimplementations of established wave-based computational optics methods, as well as novel in silico solutions to inverse problems involving previously underexplored combinations of optical effects.

Inverse problems for reconstructing samples

Microscope aberrations are often treated as uniform across the field of view (FOV), mainly because modeling and measuring position-dependent point spread functions (PSFs) is computationally and experimentally demanding. However, one research group^48,49 pointed out that many imaging systems exhibit rotational symmetry due to the symmetric nature of common optical components. As a result, most aberrations change only with radial distance from the center of the FOV, up to a rotation. Characterizing aberration variation along this radial profile drastically reduces calibration effort and computational cost—scaling linearly rather than quadratically with camera sensor height. This efficiency makes it feasible to perform deconvolution that accounts for spatially varying aberrations. The team introduced “ring deconvolution microscopy”⁴⁸, a method that exploits rotational invariance in standard microscopes using incoherent illumination (e.g., fluorescence or brightfield with incoherent light) to model spatially varying PSFs efficiently. We implemented this technique in Chromatix (Fig. 2b) using data from a UCLA Miniscope^50,51—a compact widefield microscope. The system is modeled as a 4f optical setup with rotationally symmetric Seidel aberrations placed in the Fourier plane. After calibrating Seidel coefficients from reference images⁴⁸, we deconvolved the captured sample image using this rotationally invariant yet spatially varying PSF model.

**Fig. 2: Chromatix solves inverse problems across diverse sample types and representations.**

The measured image of a rabbit liver with uneven illumination is shown alongside three reconstructions (Fig. 2c–f and Extended Data Fig. 2): Chromatix’s ring deconvolution, which restores sharpness across the entire field of view (FOV); standard deconvolution, which improves clarity only at the FOV center; and the original ring deconvolution implementation^48,49. Unlike the original implementation, ours supports a significantly larger FOV by distributing computations across multiple GPUs. The original method cannot reconstruct the full camera FOV without either slowing down dramatically or exceeding the memory limits of a single GPU (e.g., 48 GB on an RTX 8000 or 80 GB on an H100). Chromatix is also much faster: it runs 4.5× faster than the original PyTorch code on one GPU and nearly 19× faster when using 8 GPUs (see “High performance through parallelization”).

Beyond voxel grids, Chromatix supports implicit neural representations (INRs), which can simplify the optimization process in certain cases⁴⁴. For example, CoCoA (Coordinate-based Neural Representations for Computational Adaptive Optics)⁴³ simultaneously reconstructs both the sample (as an INR) and optical aberrations (as Zernike coefficients) without requiring labeled training data. Chromatix’s implementation of CoCoA is illustrated in Fig. 2g,h. In this scenario, aberrations are assumed to be uniform across space, and the sample emits incoherent fluorescence rather than being lit by incoherent transmitted light. We show a maximum-intensity projection of a 4-µm-thick slice from a widefield image of mouse neurons, followed by reconstructions from the original CoCoA method and from Chromatix (Fig. 2i–k). The Chromatix result preserves smoother, more continuous dendrites, whereas the original produces fragmented, spotty structures. Additionally, Chromatix completes the reconstruction twice as fast on a single GPU and nearly 9× faster with 8 GPUs. On a fluorescent bead dataset with known, deliberately introduced aberrations⁴³, Chromatix recovers the Zernike mode coefficients with a root-mean-square error of 3.56 nm—much better than the original method’s 6.97 nm (r.m.s. calculated over the three nonzero Zernike modes used; see Extended Data Fig. 3).

Increasing the number of INR layers might reduce detail loss in the original method⁴³, but here we compare both approaches using identical network sizes. Chromatix uses a fully paraxial model for the field at the pupil plane, whereas the original implementation combines an exact (but high-sample-rate) pupil model with a paraxial approximation of the second lens in the 4f setup. This hybrid approach risks aliasing if undersampled. Chromatix’s consistent paraxial approximation leads to superior dendrite reconstruction (Fig. 2i–k). The comparison illustrates not only higher speed but also the value of a unified, well-tested modeling framework for avoiding mismatches between simulation and reality.

Researchers in ref. ¹⁶ demonstrated that computational imaging can quantitatively map the 3D refractive index of highly scattering samples—beyond what traditional widefield microscopy allows—using intensity measurements alone. Their sample was a D. rerio embryo tail at 24 hours post-fertilization (hpf), illuminated coherently at multiple angles¹⁷. The refractive index (Fig. 2r) was inferred by matching real measurements (Fig. 2q) to a differentiable simulation (Fig. 2p) of light propagation through the sample. Because such samples scatter light many times, they require a multislice forward model⁵². Although we use the exact same physical model as the original MATLAB code¹⁷, our Chromatix implementation is 3–13× faster on comparable volumes—cutting reconstruction from hours to minutes. It’s also far more concise (~25 lines vs. ~107 in the original^16,17) and more adaptable thanks to automatic differentiation. This speed boost lets us use better reconstruction settings, eliminating the large grid artifacts seen in the original result (Fig. 2r) in favor of a cleaner output (Fig. 2s).
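For readers unfamiliar with multislice models, the sketch below shows the general technique in plain JAX: the field is multiplied by the phase delay of one thin slice, propagated to the next slice, and so on, with the gradient of a data-fidelity loss obtained by automatic differentiation. This is a generic paraxial version written for illustration. It is not the forward model of refs. 16 and 17 or the Chromatix implementation, and the sample, grid, and parameter values are synthetic assumptions.

```python
import jax
import jax.numpy as jnp


def multislice_intensity(delta_n, field_in, wavelength, dx, dz):
    """Paraxial multislice model: thin phase slice, then free-space step, repeated.

    delta_n has shape (n_slices, n, n) and holds refractive index contrast.
    """
    n = field_in.shape[-1]
    fx = jnp.fft.fftfreq(n) / dx
    fxx, fyy = jnp.meshgrid(fx, fx)
    kernel = jnp.exp(-1j * jnp.pi * wavelength * dz * (fxx**2 + fyy**2))
    k0 = 2 * jnp.pi / wavelength

    def step(u, dn_slice):
        u = u * jnp.exp(1j * k0 * dz * dn_slice)     # phase delay of one slice
        u = jnp.fft.ifft2(jnp.fft.fft2(u) * kernel)  # propagate to the next slice
        return u, None

    u_out, _ = jax.lax.scan(step, field_in.astype(jnp.complex64), delta_n)
    return jnp.abs(u_out) ** 2


n, n_slices = 32, 8
wavelength, dx, dz = 0.5, 0.5, 1.0  # micrometres, illustrative values
plane_wave = jnp.ones((n, n), dtype=jnp.complex64)

# Synthetic "true" sample: a weak cylinder of index contrast 0.02.
coords = (jnp.arange(n) - n // 2) * dx
xx, yy = jnp.meshgrid(coords, coords)
cylinder = (xx**2 + yy**2 <= 3.0**2).astype(jnp.float32)
delta_n_true = 0.02 * jnp.broadcast_to(cylinder, (n_slices, n, n))
measured = multislice_intensity(delta_n_true, plane_wave, wavelength, dx, dz)


def loss(delta_n):
    simulated = multislice_intensity(delta_n, plane_wave, wavelength, dx, dz)
    return jnp.mean((simulated - measured) ** 2)


# Gradient of the mismatch with respect to every voxel of the current guess.
delta_n_guess = jnp.zeros((n_slices, n, n))
value, grads = jax.value_and_grad(loss)(delta_n_guess)
```

Using `jax.lax.scan` keeps the compiled program size independent of the number of slices, which matters for thick samples.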

Programmable optics and deep learning

Spatial light modulators (SLMs) now offer precise control over light via millions of individually addressable pixels. While commonly used for holography, this capability also allows engineers to design custom point-spread functions (PSFs) tailored to specific tasks⁵³, such as capturing 3D fluorescent volumes in a single snapshot. Optimizing such complex, high-dimensional optical systems demands gradient-based methods and benefits greatly from GPU acceleration. One team³ developed a deep learning approach that integrates a reconfigurable microscope (the Holoscope) with a neural network to reconstruct 3D fluorescence volumes from single 2D snapshots (Fig. 3a). In this system, both the neural network weights and the SLM’s phase mask pixels are co-optimized using a differentiable model of the microscope. The PSF effectively compresses 3D information into a 2D image, which is then decoded by a FourierNet architecture that leverages structural priors of the sample. This enables rapid, accurate 3D reconstruction that combines hardware and algorithmic intelligence.

**Fig. 3: Chromatix enables deep learning for optical design and acceleration of inverse problems.**

This microscope design can therefore be programmed to function as a snapshot microscope optimized for various sample types while using exactly the same hardware. The microscope is modeled as a 4f system with an SLM (phase mask) in the Fourier plane and is optimized for whole-brain imaging of fluorescently labeled D. rerio larvae. The PSF of the 4f system is simulated with coherent propagation, and the image is simulated as the incoherent sum of these PSFs, which is efficiently implemented as a convolution of the PSF with the sample intensity. We show the learned PSF (Fig. 3b), the simulated 2D measurement of a virtual zebrafish volume (Fig. 3c) and the ground truth volume and simulated reconstruction³ (Fig. 3d,e). Chromatix reproduces the original results³ nearly exactly: on a test set of 10 volumes and their simulated images, reconstruction networks trained with identical PSFs achieve a structural similarity index measure (SSIM) of 0.979 ± 0.003 (mean ± standard error; higher is better) for both Chromatix and the original implementation³ (not significantly different at P = 0.695 via two-sided t-test, Extended Data Fig. 4). Chromatix also trains approximately 7× faster than the original implementation³ (Fig. 5). Practically, this reduces the optimization time for a single PSF from weeks to days.
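The image formation step described here (a coherent PSF from a Fourier-plane phase mask, then an incoherent image as a convolution with the sample intensity) can be sketched as follows. This is a simplified 2D, single-plane version in plain JAX with hypothetical function names. The system in the text works with 3D volumes, and the actual implementation may differ in sampling and normalization.

```python
import jax.numpy as jnp


def psf_from_phase(phase, aperture):
    """Intensity PSF of a 4f system with a phase mask in the Fourier plane."""
    pupil = aperture * jnp.exp(1j * phase)
    coherent = jnp.fft.fftshift(jnp.fft.fft2(jnp.fft.ifftshift(pupil), norm="ortho"))
    psf = jnp.abs(coherent) ** 2
    return psf / jnp.sum(psf)  # normalise so the image conserves total signal


def incoherent_image(sample, psf):
    """Circular convolution of sample intensity with a centred PSF, via FFT."""
    otf = jnp.fft.fft2(jnp.fft.ifftshift(psf))
    return jnp.real(jnp.fft.ifft2(jnp.fft.fft2(sample) * otf))


n = 32
coords = jnp.arange(n) - n // 2
xx, yy = jnp.meshgrid(coords, coords)
aperture = (xx**2 + yy**2 <= 6**2).astype(jnp.float32)

flat_phase = jnp.zeros((n, n))  # an unaberrated system, for illustration
psf = psf_from_phase(flat_phase, aperture)

# A single point emitter: its image is the PSF re-centred on the emitter.
sample = jnp.zeros((n, n)).at[10, 20].set(1.0)
image = incoherent_image(sample, psf)
```

Both functions are differentiable with respect to `phase`, which is what allows the phase mask pixels to be optimized jointly with a reconstruction network.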

SLMs also enable computer-generated holography systems for optogenetics, where 3D holographic stimulation patterns are used to perturb neural activity in the brain. Most holography systems rely on some form of iterative optimization (for example, refs. ^33,34,54,55) to find the phase to display on the SLM. While this produces accurate solutions, iteration becomes problematic when speed is paramount. For optogenetics, point cloud holography can be used to stimulate multiple neurons without iterative optimization of phase patterns, but this only allows for placing copies of a single pattern at the desired locations¹⁵. Due to the interest in holography for displays, fast holography algorithms for arbitrary patterns have emerged that use neural networks to quickly generate a hologram given a target pattern^42,56. Applied to optogenetics, DeepCGH¹² likewise generates holograms quickly by training a neural network to produce phase patterns from intensity images of arbitrary 3D patterns in a single feedforward inference step. We implemented DeepCGH¹² in Chromatix (Fig. 3f) and show the desired target patterns and the resulting simulated patterns at three different depth planes, using the phase pattern produced by the DeepCGH method (Fig. 3g–l). We achieve nearly identical results to the original TensorFlow implementation: on a test set of 16 target patterns, Chromatix achieves a structural similarity index measure (SSIM) of 0.985 ± 0.001 (mean ± standard error; higher is better) versus 0.982 ± 0.001 for the original implementation (significantly different at P = 0.018 < 0.05 via two-sided t-test, Extended Data Fig. 5) and a peak signal-to-noise ratio (PSNR) of 35.40 ± 0.37 (mean ± standard error; higher is better) versus 34.95 ± 0.16 for the original implementation¹² (not significantly different at P = 0.177 via two-sided t-test). Our implementation of the differentiable hologram simulation is approximately 17 lines of code, versus 33 lines in the original work¹². While achieving the same quality, Chromatix provides a 2.5× performance improvement on a single GPU, which increases to over 10× when using 8 GPUs in parallel (Fig. 5).

Flexible modeling with optical building blocks

Because Chromatix models are constructed from components that can be flexibly combined (Fig. 1), we can straightforwardly construct complex optical models and also optimize them with arbitrary objective functions. We show another programmable microscope modeled as a 4f system with an SLM in the Fourier plane, followed by a neural network-based reconstruction step (Fig. 4a–f). The objective in this demonstration is to optimize the PSF of this programmable microscope to perform spectroscopic single-molecule localization⁵⁷ from a single snapshot image: that is, to reconstruct multicolor point sources using only a single-channel image. The simulated samples consist of several point sources incoherently emitting fluorescence at 25 wavelengths from 400 nm to 650 nm that are simulated in parallel using Chromatix. We train the neural network to reconstruct the multicolor sample at the corresponding 10-nm intervals, giving us a hyperspectral cube from a single-channel 2D measurement. The optimized PSF allows visual classification of the color of these point sources on a monochrome simulated camera image (Fig. 4c,d) by taking advantage of different fringe patterns for different wavelengths. The reconstruction (Fig. 4f) reasonably matches the true colors of the points in the sample (Fig. 4e). We highlight that here we are optimizing the same programmable microscope model that was used for snapshot microscopy (Fig. 3a), but for an entirely new combination of sample type and objective.

**Fig. 4: Chromatix enables arbitrary combinations of optical models.**

Combining various wave-optics models in arbitrary configurations can expand the utility of differentiable simulations in biological research. In optogenetics, researchers use intricate 3D light patterns—often generated via holography—to precisely control neuronal activity. However, achieving accurate 3D holographic control is already difficult, and the challenge intensifies in biological tissue due to light scattering. As light travels through tissue, it scatters, potentially stimulating unintended neurons and raising the risk of phototoxicity. Here, we demonstrate how Chromatix can optimize holographic light patterns in highly scattering tissue by capturing the scattered output (Fig. 4g–m). Our simulation models a plane wave striking a phase mask (SLM), being focused by a thin lens, and then propagating through a scattering volume using the multislice beam propagation method—the same approach used for the scattering sample in Fig. 2p. This allows us to observe intensity distribution throughout the entire volume. Without correction, scattering leads to non-uniform stimulation (Fig. 4h). By incorporating the measured scattered intensity as feedback into the optimization loop, we achieve nearly uniform stimulation (blue line in Fig. 4h) across the full axial range. This in silico experiment illustrates how Chromatix empowers researchers to quickly prototype and refine their concepts, turning theoretical insights into practical outcomes.

High performance through parallelization

To showcase Chromatix’s computational efficiency and scalability, we measured iteration speeds across all training and optimization tasks discussed. Chromatix outperforms the original implementation of every method tested, delivering 2–6× speedups on a single GPU and up to 22× on 8 GPUs in the best case (Fig. 5). Gains on a single GPU stem largely from reduced overhead thanks to JAX compilation, compared to implementations in MATLAB, PyTorch, or TensorFlow. More dramatically, Chromatix enables order-of-magnitude accelerations through parallelization, achieved with minimal code changes thanks to its native integration with JAX. This scalability allows Chromatix to tackle large-scale problems and makes existing inverse problems far more tractable, as seen in the expanded field of view in ring deconvolution microscopy (Fig. 2a–f), the drastically reduced optimization time in refractive index microscopy (Fig. 2o–u), and faster snapshot PSF optimization using deep learning (Fig. 3a–e).
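One way JAX supports this kind of data parallelism is by sharding a batch across the available devices and letting the compiler partition a jitted function accordingly. The sketch below is a generic JAX example under assumed names and sizes, not the benchmark code behind Fig. 5. The toy simulation (`far_field_intensity`) is hypothetical, and on a machine with a single device the same code simply runs on that device.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec


def far_field_intensity(phase):
    """Toy simulation for one phase mask: Fourier transform, then intensity."""
    return jnp.abs(jnp.fft.fft2(jnp.exp(1j * phase), norm="ortho")) ** 2


# Batched and compiled once; the same function serves one or many devices.
simulate_batch = jax.jit(jax.vmap(far_field_intensity))

# Lay out all visible devices along a single "batch" axis.
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("batch",))
sharding = NamedSharding(mesh, PartitionSpec("batch"))

n = 32
batch = 2 * devices.size  # keep the batch divisible by the device count
phases = jax.random.uniform(jax.random.PRNGKey(0), (batch, n, n), maxval=2 * jnp.pi)
phases = jax.device_put(phases, sharding)  # split the batch across devices

intensities = simulate_batch(phases)
print(intensities.shape, len(jax.devices()))
```

The simulation function itself contains no device-specific code. Only the placement of the input data decides how the work is distributed.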

**Fig. 5: Chromatix is the fastest implementation of existing computational optics methods.**
