Neural Emulator Superiority: When Machine Learning for PDEs Surpasses its Training Data

NeurIPS 2025

Technical University of Munich

TL;DR

We show that neural networks can outperform the numerical simulator that produced their training data.

Abstract

Neural operators or emulators for PDEs trained on data from numerical solvers are conventionally assumed to be limited by their training data's fidelity. We challenge this assumption by identifying "emulator superiority," where neural networks trained purely on low-fidelity solver data can achieve higher accuracy than those solvers when evaluated against a higher-fidelity reference. Our theoretical analysis reveals how the interplay between emulator inductive biases, training objectives, and numerical error characteristics enables superior performance during multi-step rollouts. We empirically validate this finding across different PDEs using standard neural architectures, demonstrating that emulators can implicitly learn dynamics that are more regularized or exhibit more favorable error accumulation properties than their training data, potentially surpassing training data limitations and mitigating numerical artifacts. This work prompts a re-evaluation of emulator benchmarking, suggesting neural emulators might achieve greater physical fidelity than their training source within specific operational regimes.

The Simulation Trade-Off: Accurate vs. Fast

Simulating physical systems governed by Partial Differential Equations (PDEs) involves a fundamental trade-off between accuracy and computational cost. While we have highly accurate numerical solvers (the 'fine P' βœ… in our paper), they can be incredibly slow. For many applications, we rely on faster, 'coarse P' πŸš€ solvers. These solvers often introduce errors, like the numerical diffusion seen here, which makes the simulation appear blurry. This is an accepted compromise to make simulations computationally tractable.
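To make the trade-off concrete, here is a minimal NumPy sketch (not from the paper; grid size, time step, and pulse shape are illustrative) of a coarse πŸš€ first-order upwind solver for 1D advection. Compared against the exact solution, which on a periodic domain is just a shift, the scheme's numerical diffusion visibly damps a sharp pulse.

import numpy as np

n, c, steps = 128, 1.0, 200                  # grid size, advection speed, rollout length (illustrative)
dx = 1.0 / n
dt = 0.5 * dx / c                            # CFL number 0.5
x = np.arange(n) * dx
u0 = np.exp(-((x - 0.3) ** 2) / 0.001)       # sharp initial pulse

u = u0.copy()
for _ in range(steps):                       # coarse solver: first-order upwind
    u = u - c * dt / dx * (u - np.roll(u, 1))

u_exact = np.roll(u0, int(round(c * dt * steps / dx)))    # fine reference: exact periodic shift
print("peak of exact solution:", u_exact.max())           # the shift preserves the pulse
print("peak of coarse solution:", u.max())                # noticeably damped by numerical diffusion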

High Accuracy β‰ˆ High Computational Cost

The Conventional Wisdom: An Emulator is Limited by its Teacher

The standard approach for building a neural emulator is to train it to mimic a simulator. The prevailing assumption is that the emulator's accuracy is fundamentally capped by the fidelity of the training data. If you train a network on blurry data from a coarse solver, you expect to get a blurry model. In other words, the emulator can't be better than its teacher.

Error(Emulator πŸ€–) β‰₯ Error(Training Data πŸš€)
(This is the assumption we challenge)
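For concreteness, here is a minimal sketch of this standard distillation recipe (NumPy; solver, model class, and parameters are illustrative stand-ins, not the paper's setup): the emulator only ever sees (state, next state) pairs produced by the coarse solver and is fitted to minimize the one-step error. A tiny linear stencil plays the role of the neural network so the fit can be done in closed form.

import numpy as np

n, nu = 64, 0.4                                       # grid size, CFL number (illustrative)
rng = np.random.default_rng(0)

def coarse_step(u):                                   # the "teacher": first-order upwind advection
    return u - nu * (u - np.roll(u, 1))

inputs = rng.standard_normal((256, n))                # random training states
targets = np.stack([coarse_step(u) for u in inputs])  # one-step targets from the coarse solver

# emulator: u_new[i] = w0*u[i-1] + w1*u[i] + w2*u[i+1], fitted by least squares,
# i.e. argmin over w of the one-step error || emulator(u) - coarse_step(u) ||^2
feats = np.stack([np.roll(inputs, 1, axis=1), inputs, np.roll(inputs, -1, axis=1)], axis=-1)
w, *_ = np.linalg.lstsq(feats.reshape(-1, 3), targets.reshape(-1), rcond=None)
print("learned stencil:", w)                          # recovers [nu, 1 - nu, 0] for this teacher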

The Surprising Truth: Outperforming the Training Data

However, this conventional wisdom overlooks crucial details. We demonstrate that a neural emulator, trained purely on low-fidelity data, can produce results that are more accurate than that data. As the video shows, our emulator learns to produce a sharper, more physically realistic simulation than the coarse solver it was trained on. How is this possible?
  1. Inductive Biases: Neural network architectures aren't blank slates. A Convolutional Neural Network (ConvNet), for example, has a natural preference for local patterns. This "bias" acts as a regularizer, implicitly filtering out the structured errors of the coarse solver.
  2. Different Goals: Training and evaluation are different. We train the emulator only to predict the next single step, but we evaluate it over a long, multi-step "rollout." The emulator can learn dynamics with more favorable error accumulation properties, making it more accurate in the long run than the coarse solver.
We define the Superiority Ratio (ΞΎ) to measure this effect. When ΞΎ < 1, the emulator is superior to its training data. $$\xi = \frac{\text{Error(πŸ€– Emulator vs. βœ… Ground Truth)}}{\text{Error(πŸš€ Training Data vs. βœ… Ground Truth)}}$$
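In code, ΞΎ could be estimated from rollouts as in the sketch below (the callables emulator, coarse_step, and fine_step are hypothetical stand-ins for the trained network πŸ€–, the low-fidelity training solver πŸš€, and the high-fidelity reference βœ…).

import numpy as np

def rollout(step, u0, T):
    """Apply a one-step map T times and stack the resulting trajectory."""
    traj, u = [], u0
    for _ in range(T):
        u = step(u)
        traj.append(u)
    return np.stack(traj)

def superiority_ratio(emulator, coarse_step, fine_step, u0, T):
    """xi = rollout error of the emulator / rollout error of its training solver,
    both measured against the high-fidelity reference; xi < 1 means superiority."""
    reference = rollout(fine_step, u0, T)             # ground-truth trajectory
    err_emulator = np.linalg.norm(rollout(emulator, u0, T) - reference)
    err_training = np.linalg.norm(rollout(coarse_step, u0, T) - reference)
    return err_emulator / err_training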

The Evidence Part 1: Theoretical Proof

This isn't just an empirical curiosity. For fundamental linear PDEs (advection, diffusion, and Poisson), we can prove the existence of superiority mathematically. Using Fourier analysis, we show exactly how an emulator's simple functional form (its inductive bias) allows it to generalize better than the more complex, but imperfect, numerical solver it was trained on.
The plot shows the superiority ratio (y-axis) across different frequency modes (x-axis) when emulating the advection equation. When the line dips below 1.0, the emulator is more accurate than its training data for those frequencies. 'Forward superiority' occurs when the emulator, trained on low-frequency information (ψ), generalizes to be more accurate at high frequencies (Ο†).
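The quantity on the plot's y-axis can be sketched directly: for periodic advection, every one-step linear map acts on Fourier mode k by multiplication with a complex symbol g(k), so per-mode errors against the exact operator can be compared. In the NumPy sketch below, the exact and first-order upwind symbols are the standard closed forms; the emulator symbol is a placeholder three-point stencil, not the analytically fitted one from the paper.

import numpy as np

n, c = 64, 1.0                                   # grid points, advection speed (illustrative)
dx = 1.0 / n
dt = 0.5 * dx / c                                # CFL number 0.5
nu = c * dt / dx
k = 2 * np.pi * np.fft.fftfreq(n, d=dx)[1:]      # wavenumbers; drop mode 0, where both maps are exact

g_exact  = np.exp(-1j * k * c * dt)                       # exact advection: a pure phase shift per mode
g_coarse = 1 - nu * (1 - np.exp(-1j * k * dx))            # first-order upwind amplification factor
w = np.array([0.1, 0.45, 0.45])                           # placeholder emulator weights
g_emulator = w[0] * np.exp(1j * k * dx) + w[1] + w[2] * np.exp(-1j * k * dx)   # stencil on u[i+1], u[i], u[i-1]

xi_per_mode = np.abs(g_emulator - g_exact) / np.abs(g_coarse - g_exact)
# modes with xi_per_mode < 1 are those where this emulator beats the solver it imitates
print("min / max per-mode ratio:", xi_per_mode.min(), xi_per_mode.max())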

The Evidence Part 2: Experimental Validation

The principle of emulator superiority also holds in practice for complex, nonlinear PDEs like the Burgers' equation and across a wide range of modern neural network architectures. We found that architectures with strong spatial or spectral inductive biases (like ConvNets and Fourier Neural Operators) are particularly effective at achieving superiority. This confirms that inductive bias is a key ingredient.
A common approximation strategy for simulating fluid-related phenomena with implicit time integration is to truncate the nonlinear solver iterations. While this greatly reduces simulation time, it can leave nonlinear phenomena under-resolved, like the incorrect shock propagation in the Burgers example above (a). Surprisingly, a UNet emulator trained only on this coarse data learns a better shock propagation (b), more in line with the resolved high-fidelity solution (c), leading to superiority in the rollout (d).
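As a rough illustration of this truncation (not the paper's exact discretization or solver), the sketch below takes a backward-Euler step of viscous Burgers and simply cuts off the fixed-point iteration for the implicit equation after a given number of sweeps: a single sweep is cheap but less accurate, while many sweeps approximate the converged implicit solution.

import numpy as np

n, visc = 128, 5e-3                           # grid size, viscosity (illustrative)
dx, dt = 1.0 / n, 2e-3                        # time step small enough for the iteration to contract

def rhs(u):
    """Central-difference right-hand side of u_t = -u u_x + visc * u_xx (periodic)."""
    ux  = (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)
    uxx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    return -u * ux + visc * uxx

def implicit_step(u, iters):
    """Backward-Euler step, u_next = u + dt * rhs(u_next), solved by a fixed-point
    iteration that is simply truncated after `iters` sweeps."""
    v = u.copy()
    for _ in range(iters):
        v = u + dt * rhs(v)
    return v

x = np.arange(n) / n
u0 = np.sin(2 * np.pi * x) + 0.5              # profile that steepens into a shock
coarse, fine = u0.copy(), u0.copy()
for _ in range(100):
    coarse = implicit_step(coarse, iters=1)   # truncated: cheap, low-fidelity training data
    fine   = implicit_step(fine,   iters=50)  # (nearly) converged implicit reference
print("max deviation of the truncated run:", np.abs(coarse - fine).max())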

Why This Matters: A New Perspective on Benchmarking

Many current benchmarks for neural PDE solvers use data from numerical simulators as the ground truth. Since imperfections in simulation algorithms are unavoidable, they propagate into these datasets. Our work shows this can be misleading: a superior emulator that correctly learns the underlying physics might be unfairly penalized for not reproducing the numerical errors of the flawed reference data.
  1. When benchmarking neural emulators, it's crucial to consider the fidelity of the training data. Evaluations should ideally be against higher-fidelity references to truly assess physical accuracy.
  2. Designing benchmarks that account for the potential superiority of emulators can lead to more meaningful assessments of their capabilities. This might involve using experimental data or high-fidelity simulations as ground truth where possible.

BibTeX


@article{koehler2025neural,
    title={Neural Emulator Superiority: When Machine Learning for {PDE}s Surpasses its Training Data},
    author={Felix Koehler and Nils Thuerey},
    journal={Advances in Neural Information Processing Systems (NeurIPS)},
    volume={39},
    year={2025}
}