APEBench: A Benchmark for Autoregressive Neural Emulators of PDEs

NeurIPS 2024

Technical University of Munich

TL;DR

APEBench is a JAX-based tool to evaluate autoregressive neural emulators for PDEs on periodic domains in 1D, 2D, and 3D. It comes with an efficient reference simulator based on spectral methods that is used for procedural data generation (no need to download large datasets with APEBench). Since this simulator can also be embedded into emulator training (e.g., for a "solver-in-the-loop" correction setting), this is the first benchmark suite to support differentiable physics.

Abstract

We introduce the Autoregressive PDE Emulator Benchmark (APEBench), a comprehensive benchmark suite to evaluate autoregressive neural emulators for solving partial differential equations. APEBench is based on JAX and provides a seamlessly integrated differentiable simulation framework employing efficient pseudo-spectral methods, enabling 46 distinct PDEs across 1D, 2D, and 3D. Facilitating systematic analysis and comparison of learned emulators, we propose a novel taxonomy for unrolled training and introduce a unique identifier for PDE dynamics that directly relates to the stability criteria of classical numerical methods. APEBench enables the evaluation of diverse neural architectures, and unlike existing benchmarks, its tight integration of the solver enables support for differentiable physics training and neural-hybrid emulators. Moreover, APEBench emphasizes rollout metrics to understand temporal generalization, providing insights into the long-term behavior of emulating PDE dynamics. In several experiments, we highlight the similarities between neural emulators and numerical simulators.

Focus on Rollout Performance

Rather than temporally aggregated metrics, APEBench always returns rollout errors to understand temporal generalization. The evaluation supports a wide range of metrics, e.g., classical normalized RMSE metrics, Fourier-based metrics restricted to certain frequency ranges, and Sobolev-based metrics (H1) that highlight mismatches in higher frequencies.
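As a minimal sketch of what such a rollout metric looks like (plain JAX; the helper below is illustrative and not APEBench's API), a normalized RMSE can be computed for every step of a predicted trajectory against the reference:

import jax.numpy as jnp

def rollout_nrmse(pred, ref):
    """Normalized RMSE per rollout step (illustrative, not the APEBench API).

    pred, ref: arrays of shape (time, channel, *spatial)
    returns:   array of shape (time,)
    """
    axes = tuple(range(1, pred.ndim))  # reduce over channel and space
    rmse = jnp.sqrt(jnp.mean((pred - ref) ** 2, axis=axes))
    norm = jnp.sqrt(jnp.mean(ref ** 2, axis=axes))
    return rmse / norm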

Unified PDE identifiers

Describing dynamics with a reduced set of information creates an exchange protocol that uniquely identifies an experiment. These "difficulties" also encode the challenge of neural emulation by incorporating the spatial resolution and the number of spatial dimensions.
For example, in the animation below over diffusivity, convectivity, and dispersivity, we can describe a wide range of PDEs: Diffusion, Burgers, Korteweg-de Vries, Dispersion, and Dispersion-Diffusion.
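To sketch the relation to classical stability criteria in the simplest case (1D advection; the notation here is assumed for illustration, not quoted from the paper): with advection speed c, time step Δt, domain length L, and N grid points, the difficulty recovers the CFL number that governs the stability of classical explicit schemes,

    \gamma_1 \,=\, c \,\frac{\Delta t}{L}\, N \,=\, c \,\frac{\Delta t}{\Delta x} \qquad (\text{since } \Delta x = L / N).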

Built-In Support for Unrolled Training

APEBench is built around autoregressive emulation and hence emphasizes the temporal axis in emulator learning. This includes the option for unrolled training (also called autoregressive, recursive, or rollout training). We unify many approaches in terms of the main chain (= unrolled) length T and the branch chain (= reference) length B.
One-step supervised training is T=B=1, while five-step supervised unrolled training is T=B=5. Branch-one diverted-chain training is T=5, B=1; the latter requires a differentiable solver, readily available in APEBench.
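A deliberately simplified reading of this taxonomy in plain JAX (illustrative, not the APEBench API; the paper's taxonomy covers more variants): the network produces a main chain of T states, and from each eligible node a branch of B reference-solver steps is diverted to produce the comparison target.

import jax.numpy as jnp

def diverted_chain_loss(params, u0, net_step, solver_step, T, B):
    # Main chain: unroll the network T steps from the data point u0
    main = [u0]
    for _ in range(T):
        main.append(net_step(params, main[-1]))
    # Branch chains: divert B reference-solver steps from each eligible
    # node and compare with the main chain B steps later. Branches start
    # at network-produced states, so the solver must be differentiable.
    loss = 0.0
    for t in range(T - B + 1):
        ref = main[t]
        for _ in range(B):
            ref = solver_step(ref)
        loss += jnp.mean((main[t + B] - ref) ** 2)
    return loss / (T - B + 1)

With T=B=1 this recovers one-step supervised training; T=5, B=1 gives branch-one diverted-chain training.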

A wide range of Dynamics

All dynamics are accessible via the reduced difficulty and normalized interfaces, or via a physical interface; most are available in 1D, 2D, and 3D. The table below lists the scenario identifiers for each interface (see the usage sketch after the table).
Difficulty Physical Normalized
diff_lin phy_poisson norm_lin
diff_lin_simple phy_sh norm_adv
diff_adv phy_gs norm_diff
diff_diff phy_gs_type norm_adv_diff
diff_adv_diff phy_decay_turb norm_disp
diff_disp phy_kolm_flow norm_fisher
diff_hyp_diff phy_lin norm_four
diff_four phy_lin_simple norm_hypdiff
diff_conv phy_adv norm_nonlin
diff_burgers phy_diff norm_conv
diff_kdv phy_adv_diff norm_burgers
diff_ks_cons phy_disp norm_kdv
diff_ks phy_hyp_diff norm_ks_cons
diff_nonlin phy_four norm_ks
diff_burgers_sc phy_nonlin norm_burgers_sc
diff_fisher phy_burgers_sc norm_lin_simple
phy_kdv
phy_ks
phy_conv
phy_burgers
phy_ks_cons
phy_poly
phy_fisher
phy_unbal_adv
phy_diag_diff
phy_aniso_diff
phy_mix_disp
phy_mix_hyp
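For instance, each identifier maps to a scenario class that can be instantiated and run directly (the class path and argument names below follow the project README; treat them as indicative rather than a fixed API):

import apebench

# "diff_adv": advection in difficulty mode
advection_scenario = apebench.scenarios.difficulty.Advection()

# Train and evaluate an emulator on the scenario (argument names as in
# the README; the network/training identifiers are assumptions here)
data, trained_nets = advection_scenario(
    task_config="predict",
    network_config="Conv;26;10;relu",
    train_config="one",
    num_seeds=3,
)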

Procedural data generation

APEBench's embedded pseudo-spectral solver is very efficient. All training and test data is procedurally (deterministically) generated on-the-fly. There is no need to download large datasets, and the simulator can be embedded into the training loop for all kinds of differentiable physics like "solver-in-the-loop" correction setups.
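A minimal sketch of this on-the-fly pattern in plain JAX (solver_step stands in for the embedded pseudo-spectral stepper; everything here is illustrative): a seeded random initial state is rolled out deterministically, so the same seed always reproduces the same trajectory.

import jax
import jax.numpy as jnp

def make_trajectory(seed, solver_step, num_steps, num_points=160):
    key = jax.random.PRNGKey(seed)
    u0 = jax.random.normal(key, (num_points,))  # placeholder initial state

    def scan_fn(u, _):
        u_next = solver_step(u)
        return u_next, u_next

    _, trajectory = jax.lax.scan(scan_fn, u0, None, length=num_steps)
    return trajectory  # shape: (num_steps, num_points)

# Example with a trivial stand-in "solver": exponential decay
trajectory = make_trajectory(0, lambda u: 0.99 * u, num_steps=50)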

Simplified Study Workflow

An APEBench study is a list of dictionaries that APEBench executes, conveniently returning Pandas dataframes that can be used for statistical postprocessing (e.g., over random seeds) via Seaborn.
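A sketch of such a study (the dictionary keys and the runner helper follow the project README; consider them indicative rather than a fixed API):

import apebench
import seaborn as sns

# Compare a ConvNet and an FNO on 1D advection, three seeds each
configs = [
    {
        "scenario": "diff_adv",
        "task": "predict",
        "net": net,
        "train": "one",
        "start_seed": 0,
        "num_seeds": 3,
    }
    for net in ["Conv;26;10;relu", "FNO;12;8;4;gelu"]
]

# Runner and column names as in the README; it returns long-format
# ("melted") dataframes ready for Seaborn
melted_metrics, _, _ = apebench.run_study_convenience(configs)
sns.lineplot(melted_metrics, x="time_step", y="mean_nRMSE", hue="net")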

Seed Statistics are a first-class Citizen

APEBench inherently supports re-running experiments with different random seeds (for network initialization, stochastic minibatching, and optionally also for the procedural data generation). Seed statistics allow for clearly determining a superior emulator architecture or learning methodology based on hypothesis testing. For 1D scenarios, APEBench can parallelize multiple seeds on one GPU to obtain seed statistics virtually for free.
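A minimal, self-contained illustration of the underlying mechanism (a toy problem, not APEBench's internals): jax.vmap maps a training function over a batch of PRNG keys, so all seeds run simultaneously on one device.

import jax
import jax.numpy as jnp

x = jnp.linspace(0.0, 1.0, 32)
y = 3.0 * x  # toy regression target

def train_one_seed(key):
    w = jax.random.normal(key)  # seed-dependent initialization
    for _ in range(100):
        grad = jax.grad(lambda w: jnp.mean((w * x - y) ** 2))(w)
        w = w - 0.1 * grad
    return w

keys = jax.random.split(jax.random.PRNGKey(0), 10)  # 10 seeds
final_weights = jax.vmap(train_one_seed)(keys)      # all seeds in parallel
print(final_weights.shape)  # (10,)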

Integrated Volume Renderer

APEBench is accompanied by an efficient (Rust/WebGPU-based) volume renderer to quickly visualize 2D and 3D trajectories. Try it yourself with a five-axis NumPy array (time x channel x space_0 x space_1 x space_2) or with precomputed Gray-Scott data (caution: this downloads ~100 MB in the opened tab).
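For example, an array with the expected layout can be prepared like this (shapes are illustrative):

import numpy as np

# (time, channel, space_0, space_1, space_2), as expected by the renderer
trajectory = np.random.rand(50, 1, 64, 64, 64).astype(np.float32)
np.save("trajectory.npy", trajectory)  # saved for loading into the renderer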

The Relation between Neural Emulators and Numerical Simulators

The fine-grained control over the emulation scenarios allows for drawing analogies between neural emulation and classical numerical simulation. For example, (a) the performance of convolution-based architectures is bound by their receptive field and the difficulty (γ₁, which corresponds to the CFL number) of the advection scenario, whereas the pseudo-spectral FNO architecture is agnostic to changes in γ₁. (b) For the highest difficulty, unrolling improves the accuracy of the ResNet.

Benchmark Neural-Hybrid Emulators with Differentiable Physics

With the embedded differentiable solver, APEBench can investigate neural-hybrid correction setups. For example, both a ResNet and an FNO can be used either as full-prediction emulators or as neural-hybrid emulators for 2D advection (γ₁=10.5), with a coarse solver handling 10% or 50% of the difficulty. Training with unrolling benefits the receptive-field-limited ResNet yet shows only marginal improvement for the FNO. The ResNet can work in symbiosis with a coarse simulator.
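Conceptually, such a correction setup composes a coarse differentiable solver step with a learned correction; a minimal sketch (placeholder functions, not APEBench's API):

# Neural-hybrid ("solver-in-the-loop") step, illustrative only: the
# coarse solver handles part of the dynamics and the network corrects
# the remainder; during unrolled training, gradients flow through both
# because the solver is differentiable.
def hybrid_step(params, u, coarse_step, net_correct):
    u_coarse = coarse_step(u)             # cheap physics step
    return net_correct(params, u_coarse)  # learned correction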

A wide range of architectures and PDE dynamics

The wide range of PDE dynamics in 1D, 2D, and 3D allows for drawing further analogies. In the paper, we investigated a subset of the 46 PDE dynamics.

APEBench's experiments suggest neural architectures

Ultimately, with the studies conducted for the APEBench paper, we can suggest emulator architectures with the following decision tree.

BibTeX


@article{koehler2024apebench,
    title={{APEBench}: A Benchmark for Autoregressive Neural Emulators of {PDE}s},
    author={Felix Koehler and Simon Niedermayr and R{\"u}diger Westermann and Nils Thuerey},
    journal={Advances in Neural Information Processing Systems (NeurIPS)},
    volume={38},
    year={2024}
}