%%capture
!pip install phiml
from phiml import math
Like JAX, ΦML provides a functional approach to automatic differentiation. You can obtain the derivative of a function using math.gradient().
Note that we have to set the backend to JAX, PyTorch, or TensorFlow, since NumPy does not support automatic differentiation.
math.use('torch')
def loss_function(x, y):
    return x ** 2 * y
dx_function = math.gradient(loss_function, wrt='x')
dx_function(x=1., y=1.)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/phiml/math/_functional.py:628: RuntimeWarning: Using torch for gradient computation because numpy does not support jacobian()
  warnings.warn(f"Using {math.default_backend()} for gradient computation because {key.backend} does not support jacobian()", RuntimeWarning)
(tensor(1., grad_fn=<MulBackward0>), tensor(2.))
By default, the gradient function also returns the output of the original function. In the above case, the loss value is 1 and the gradient is 2.
To get only the gradient, pass get_output=False.
dx_function = math.gradient(loss_function, wrt='x', get_output=False)
dx_function(x=1., y=1.)
tensor(2.)
Since we passed in native types (not ΦML tensors), we also get native types as a result.
Let's pass a tensor for x instead.
x = math.wrap([0, 1, 2], math.channel('values'))
try:
    dx_function(x, y=1.)
except Exception as exc:
    print(exc)
Loss must be reduced to a scalar
This failed because gradient() requires our function to return a scalar or batched scalar, but we returned three values along a channel axis. This restriction applies to all dimension types except batch dimensions, which are automatically summed over.
x = math.wrap([0, 1, 2], math.batch('values'))
dx_function(x, y=1.)
(0.000, 2.000, 4.000) along valuesᵇ
def loss_function(x, y):
    return math.l2_loss(x ** 2 * y)
dx_function = math.gradient(loss_function, wrt='x', get_output=False)
dx_function(x, y=1.)
(0.000, 2.000, 16.000) along valuesᵇ
We can get the gradients w.r.t. multiple values by passing multiple strings or a comma-separated str.
math.gradient(loss_function, wrt='x,y', get_output=False)(1, 1)
[tensor(2.), tensor(1.)]
You can also compute the gradient w.r.t. pytrees and dataclasses.
from dataclasses import dataclass

@dataclass
class Vec:
    x1: math.Tensor
    x2: math.Tensor

    def __mul__(self, other):
        return Vec(self.x1 * other, self.x2 * other)

    def __pow__(self, power, modulo=None):
        return Vec(self.x1 ** power, self.x2 ** power)

    def __value_attrs__(self):
        return 'x1', 'x2'
dx_function(x=Vec(1, 2), y=1.)
Vec(x1=tensor(2.), x2=tensor(16.))
Here, we create the custom class Vec which holds two properties, x1 and x2. In __value_attrs__, we declare that both members should be considered as values for value operations, such as l2_loss.
The analogous method __variable_attrs__ defines which attributes should be considered for automatic differentiation. If it is not implemented, all value attributes are treated as variable.
ΦML also provides finite-difference differential operators.
Another important function transformation is JIT compilation. When training neural networks, the gradient is typically computed under the hood.
🌐 ΦML • 📖 Documentation • 🔗 API • ▶ Videos • Examples