The backends PyTorch, TensorFlow and Jax have built-in automatic differentiation functionality. However, the respective APIs vary widely in how the gradients are computed. ΦFlow seeks to unify optimization and gradient computation so that code written against the ΦFlow API will work with all backends. Nevertheless, we recommend using backend-specific optimization for certain tasks like neural network training.
The following sections give an overview of which ΦFlow optimization functions are supported by which backends.
ΦFlow provides functions for solving linear equations as well as finding optimal parameters for non-linear problems. These functions work with all backends but run much slower with NumPy due to the lack of analytic gradients.
`math.minimize()` and `math.solve_nonlinear()` solve unconstrained nonlinear optimization problems.
The following example uses L-BFGS-B to find a solution to a nonlinear optimization problem using `math.minimize()`:
```python
def loss(x1: Grid, x2: Grid) -> math.Tensor:
    return field.l2_loss(physics(x1, x2) - target)

x0 = x0_1, x0_2
solution = math.minimize(math.jit_compile(loss), math.Solve('L-BFGS-B', 0, 1e-3, x0=x0))
```
For solving linear systems of equations, ΦFlow provides the function `math.solve_linear()`.
The following example uses the conjugate gradient algorithm to solve `f(x) = y`:
```python
@math.jit_compile_linear
def f(x: Grid) -> Grid:
    return field.where(mask, 2 * field.laplace(x), x)

x = math.solve_linear(f, y, math.Solve('CG', 1e-3, 0, x0=x0))
```
Which solver implementation is used depends on the backend.
However, all backends support conjugate gradient (`'CG'`) and conjugate gradient with adaptive step size (`'CG-adaptive'`); the latter is provided by Φ-Flow's own implementation on all backends.
Pass `'auto'` to let ΦFlow choose an appropriate solver.
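As a quick illustration of the `'auto'` option, the method string is simply passed to `math.Solve`; the sketch below reuses `f`, `y`, and `x0` from the linear solve example above:

```python
# Let ΦFlow pick an appropriate solver for the active backend.
# f, y and x0 are the linear function, right-hand side and initial guess from above.
x = math.solve_linear(f, y, math.Solve('auto', 1e-3, 0, x0=x0))
```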
By default, `math.minimize()` and `math.solve_linear()` return only the solution.
To access further information about the optimization, run the solve within a `SolveTape` context.
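The following sketch shows how this might look; indexing the tape with the `Solve` object and the attribute names `iterations` and `residual` are assumptions and may differ between versions:

```python
solve = math.Solve('CG', 1e-3, 0, x0=x0)
with math.SolveTape() as solves:
    x = math.solve_linear(f, y, solve)
info = solves[solve]       # assumed: the tape is indexed by the Solve object
print(info.iterations)     # assumed attribute names for iteration count and residual
print(info.residual)
```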
When a solve does not find a solution, a subclass of `ConvergenceException` is thrown.
```python
solve = math.Solve('CG', 1e-3, 0, max_iterations=300)
try:
    solution = field.solve_linear(..., solve=solve)
    # alternatively: solution = field.minimize(..., solve=solve)
    converged = True
except NotConverged as not_converged:
    print(not_converged)
    last_estimate = not_converged.x
except Diverged as diverged:
    print(diverged)
iterations_performed = solve.result.iterations  # available in any case
```
Backpropagation through solves is supported as well; the gradient is computed by running an additional solve during the backward pass.
The following code shows how to specify different settings for the backpropagation solve:
```python
from phi.flow import *

gradient_solve = math.Solve('auto', 1e-5, 0, max_iterations=100, x0=x0)
solve = math.Solve('CG', 1e-5, 1e-5, gradient_solve=gradient_solve, x0=None)
```
Solves during backpropagation raise the same exceptions as in the forward pass.
The only exception is TensorFlow, where a warning message is printed instead, since exceptions cannot be raised during backpropagation.
Information about backpropagation solves can be obtained in the same way as for forward solves, i.e. by placing a `SolveTape` context around the gradient computation.
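A possible sketch, using `field.functional_gradient` (introduced below) and a placeholder `loss_fn`; whether the backpropagation solve is recorded exactly this way is an assumption:

```python
# Sketch only: loss_fn is a placeholder loss function of x.
grad_fn = field.functional_gradient(loss_fn, wrt=0)
with math.SolveTape() as solves:
    dx = grad_fn(x0)
# solves now also contains information about the backpropagation solve (assumed).
```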
There are two ways to evaluate gradients using ΦFlow: computing gradients of functions and computing gradients of tensors. The former is the preferred way and works with all backends except NumPy.
`field.functional_gradient()` computes the gradient of a Python function with respect to one or multiple arguments.
The following example evaluates the gradient of `physics` w.r.t. two of its inputs:
```python
def physics(x: Grid, y: Grid, target: Grid) -> math.Tensor:
    x, y = step(x, y)
    return field.l2_loss(y - target), x

gradient = field.functional_gradient(physics, wrt=[0, 1], get_output=True)
loss, x, dx, dy = gradient(...)
```
In the above example, `wrt=[0, 1]` specifies that we are interested in the gradient with respect to the first and second argument of `physics`.
With `get_output=True`, evaluating the gradient also returns all regular outputs of `physics`; otherwise only `dx, dy` would be returned.
Note that the first output of `physics` must be a scalar loss value. All other outputs are not part of the gradient computation.
Note that higher-order derivatives might not work as expected with some backends.
Computing gradients of tensors directly is only supported by PyTorch and TensorFlow.
Operations within a `math.record_gradients` context record gradient information; `math.gradients()` then computes the gradients with respect to a previously marked tensor.
Warning: You have to manually detach PyTorch tensors that will be used outside the context. Higher-order derivatives are not supported for PyTorch.
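A rough sketch of how this might look; the exact signatures of `math.record_gradients` and `math.gradients` are assumptions and may differ between versions:

```python
# Sketch only: assumes math.record_gradients(x) marks x for gradient recording
# and math.gradients(loss, x) returns the gradient of loss w.r.t. x.
with math.record_gradients(x):
    loss = math.l2_loss(x - target)   # placeholder loss computation
    dx = math.gradients(loss, x)
```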
Backend-specific optimization is recommended for neural network training.
Unfortunately, all supported backends take a different approach to computing gradients.
TensorFlow and PyTorch include various optimizers for neural network (NN) training. Additionally, NN variables created through the respective layer functions are typically marked as variables by default, meaning the computational graph for derived tensors is created automatically.
The following script shows how a PyTorch neural network can be trained. See the demo network_training_pytorch.py for a full example.
```python
net = u_net(2, 2)
optimizer = optim.Adam(net.parameters(), lr=1e-3)

for training_step in range(100):
    data: Grid = load_training_data(training_step)
    optimizer.zero_grad()
    prediction: Grid = field.native_call(net, data)
    simulation_output: Grid = simulate(prediction)
    loss = field.l2_loss(simulation_output)
    loss.native().backward()
    optimizer.step()
```
In the above example, `field.native_call()` extracts the field values as PyTorch tensors with shape `(batch_size, channels, spatial...)`, then calls the network and returns the result again as a `Grid`.
Since `loss` is a `phi.math.Tensor`, we need to invoke `native()` to call PyTorch's `backward()` on the underlying tensor.
For TensorFlow, a `GradientTape` context is required around the network evaluation, physics, and loss definition.
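A minimal sketch of the TensorFlow counterpart to the PyTorch loop above, assuming the TensorFlow backend is active and that `u_net` returns a Keras model; `load_training_data` and `simulate` are the same placeholders as before:

```python
import tensorflow as tf

net = u_net(2, 2)  # assumed to return a Keras model under the TensorFlow backend
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

for training_step in range(100):
    data: Grid = load_training_data(training_step)
    with tf.GradientTape() as tape:
        prediction: Grid = field.native_call(net, data)
        simulation_output: Grid = simulate(prediction)
        loss = field.l2_loss(simulation_output)
    gradients = tape.gradient(loss.native(), net.trainable_variables)
    optimizer.apply_gradients(zip(gradients, net.trainable_variables))
```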