%%capture
!pip install phiml
from phiml import math
from phiml.math import spatial, instance, channel, batch
import torch
The interplay between dimension types and names enables user code to be much more concise and expressive. These advantages are hard to explain in the abstract, so instead we are going to demonstrate the benefits using simple examples.
Operations like gather and scatter -- taking values out of a tensor or writing data into a tensor -- are among the most basic and important operations.
Task: Compute min(0, value) for some values at given indices of a data tensor and write the updated values back to the tensor. The complexity should be independent of the size of data and the code should be differentiable.
Let's look at the Φ-ML version first. We are given the data, ordered in the usual format y,x, and the indices, ordered as x,y.
data = math.tensor([[1, 2, 3], [-4, -5, -6]], spatial('y,x'))
indices = math.tensor([(0, 0), (2, 1)], instance('indices'), channel(idx='x,y'))
We can compute the result by gathering the values at the indices, computing the minimum, and then writing them back.
math.scatter(data, indices, math.minimum(0, data[indices]))
0; 2; 3; -4; -5; -6 (yˢ=2, xˢ=3) int64
As expected, the 1 at index (0,0) was replaced by a 0 while the -6 at (2,1) was already lower than 0. Also, the channel order was automatically matched to the dimension order since Φ-ML allows us to specify it directly.
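To illustrate this name-based matching, we could also list the index components in y,x order; the result is unchanged because Φ-ML matches components to dimensions by name rather than by position (a quick check we add here, not part of the original example):

indices_yx = math.tensor([(0, 0), (1, 2)], instance('indices'), channel(idx='y,x'))  # same points, components reordered
math.scatter(data, indices_yx, math.minimum(0, data[indices_yx]))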
Actually, the Φ-ML scatter function already has a mode for computing the minimum, so we could have instead written
math.scatter(data, indices, 0, mode=min)
0; 2; 3; -4; -5; -6 (yˢ=2, xˢ=3) int64
Now let's look at the same operation in PyTorch, without dimension names.
data = torch.tensor([[1, 2, 3], [-4, -5, -6]]) # y,x
indices = torch.tensor([(0, 0), (2, 1)]) # x,y
It turns out that this is quite hard to get right. After our initial attempts at this task failed, we asked two fellow AI researchers who use PyTorch for help; neither could produce working code within 10 minutes. The following is what ChatGPT came up with, given a detailed description of the task:
try:
    # ChatGPT "solution"
    update_indices = indices[:, [1, 0]]
    update_values = torch.min(torch.zeros_like(update_indices, dtype=data.dtype), data[update_indices[:, 0], update_indices[:, 1]])
    data.scatter_add_(0, update_indices, update_values)
except RuntimeError as err:
    print(err)
index 2 is out of bounds for dimension 0 with size 2
Getting this simple exercise right seems to be quite difficult, both for LLMs and long-time PyTorch users. We will leave this as an exercise to the reader. If you think you have a solution, check that the code is differentiable as well!
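For reference, here is one possible solution (a sketch we add for illustration, not part of the original exercise). It relies on the out-of-place Tensor.index_put, which keeps the write-back differentiable; note that data must be cast to float for gradients to flow:

x, y = indices[:, 0], indices[:, 1]  # indices are ordered x,y
data_f = data.float().requires_grad_()  # gradients require a float tensor
values = torch.minimum(torch.zeros(len(indices)), data_f[y, x])
result = data_f.index_put((y, x), values)  # out-of-place, unlike index_put_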
Now imagine we had a batch dimension on data as well.
Let's try this in Φ-ML!
data = math.tensor([[1, 2, 3], [-4, -5, -6]], spatial('y,x'))
indices = math.tensor([(0, 0), (2, 1)], instance('indices'), channel(idx='x,y'))
data *= math.range(batch(b=10)) # this is new!
Our code from above works in this setting as well. To check this, we print batch index 1, which matches the case above.
math.scatter(data, indices, 0, mode=min).b[1]
0; 2; 3; -4; -5; -6 (yˢ=2, xˢ=3) int64
Making PyTorch code scale with arbitrary batch dimensions is exceedingly difficult. That's why practically all PyTorch code requires inputs with a fixed number of dimensions in a specific order.
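To see why, consider what a hand-written PyTorch solution would have to look like with a leading batch dimension (again a sketch we add for illustration): every indexing expression must be rewritten to account for the extra dimension.

data_torch = torch.tensor([[1, 2, 3], [-4, -5, -6]])  # y,x
indices_torch = torch.tensor([(0, 0), (2, 1)])  # x,y
batched = data_torch * torch.arange(10).reshape(10, 1, 1)  # add a batch dimension of size 10
x, y = indices_torch[:, 0], indices_torch[:, 1]
updated = batched.clone()
updated[:, y, x] = torch.minimum(torch.zeros_like(batched[:, y, x]), batched[:, y, x])
print(updated[1])  # batch index 1 matches the un-batched case

Adding or removing a dimension silently invalidates every index expression, which is exactly what named dimensions avoid.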
Task: For each 1D sequence in a batch, find the index of the first 0 that has two positive neighbors.
Φ-ML solution:
data = math.tensor([[0, 1, 0, 2, 0, 1], [-1, 0, 1, 0, 2, 1]], batch('b'), spatial('seq'))
nb_offset = math.tensor([-1, 1], instance('neighbors'))
zero_positions = math.nonzero(data.seq[1:-1] == 0) + 1  # interior zeros only
valid = math.all(data[zero_positions + nb_offset] > 0, 'neighbors')  # are both neighbors positive?
zero_positions[valid].nonzero[0]  # first valid zero per batch
(seq=2); (seq=3) (bᵇ=2, vectorᶜ=seq) int64
PyTorch solution:
data = torch.tensor([[0, 1, 0, 2, 0, 1], [-1, 0, 1, 0, 2, 1]])
result = []
for sequence in data:
    zero_positions = torch.nonzero(sequence == 0, as_tuple=True)[0]
    valid_positions = zero_positions[(zero_positions > 0) & (zero_positions < len(sequence) - 1)]
    neighbors_positive = (sequence[valid_positions - 1] > 0) & (sequence[valid_positions + 1] > 0)
    result_index = valid_positions[neighbors_positive][0]
    result.append(result_index)
torch.stack(result, dim=0)
tensor([2, 3])
Unlike the PyTorch version, Φ-ML automatically vectorizes over the sequences without the user needing to write a for loop. Note that torch.func.vmap is no way out here: torch.nonzero has a data-dependent output shape, which vmap does not support.
Consider the discrete Laplace operator ∇². On a 1D grid, it can be computed with the stencil (1, -2, 1) and in 2D with the stencil (0 1 0 / 1 -4 1 / 0 1 0).
Task: Implement the Laplace operator for n-dimensional grids.
With Φ-ML's typed dimensions, we can use shift, passing in the dimensions along which we want to shift the data. Then we apply the stencil and sum the components.
data_1d = math.tensor([0, 1, 0], spatial('x'))
data_2d = math.tensor([[0, 0, 0], [0, 1, 0], [0, 0, 0]], spatial('y,x'))
def laplace(x, padding='zero-gradient'):
    left, center, right = math.shift(x, (-1, 0, 1), spatial, padding)
    return math.sum(left + right - 2 * center, 'shift')

print(laplace(data_1d))
math.print(laplace(data_2d))
(1, -2, 1) along xˢ int64
0, 1, 0,
1, -4, 1,
0, 1, 0
 along (yˢ=3, xˢ=3)
This automatically generalizes to n dimensions since we shift in all spatial dimensions.
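As a quick sanity check (an example we add here), applying the same function to a 3D unit impulse yields the 7-point stencil; the center slice shows the familiar 2D cross with -6 at the center:

import numpy as np
impulse = np.zeros((3, 3, 3))
impulse[1, 1, 1] = 1
data_3d = math.tensor(impulse, spatial('z,y,x'))
math.print(laplace(data_3d).z[1])  # center slice: 0 1 0 / 1 -6 1 / 0 1 0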
Doing this directly with PyTorch is much more cumbersome. After multiple iterations of generating code with ChatGPT and feeding the error messages back to it, it converged on the following output:
import torch
import torch.nn.functional as F

# ChatGPT "solution"
def laplace_operator_nd(grid):
    # Get the number of dimensions
    ndim = grid.dim()
    # Construct the Laplace stencil for n-dimensions
    laplace_stencil = torch.zeros((1,) * (ndim - 1) + (1, 3, 3))
    center_idx = tuple(slice(1, 2) for _ in range(ndim - 1)) + (0, 1, 1)
    laplace_stencil[center_idx] = -2
    for i in range(ndim - 1):
        laplace_stencil = laplace_stencil.narrow(i, 0, 1).clone()
    # Apply the convolution along each dimension
    laplace_result = grid.clone()
    for i in range(ndim):
        laplace_result = F.conv1d(laplace_result.unsqueeze(0), laplace_stencil.to(laplace_result.device), padding=1)
        laplace_result = laplace_result.squeeze(0)
    return laplace_result
data_1d = torch.tensor([1, 2, 3, 4, 5])
try:
    result_1d = laplace_operator_nd(data_1d)
except RuntimeError as err:
    print(err)
Given groups=1, weight of size [1, 3, 3], expected input[1, 1, 5] to have 3 channels, but got 1 channels instead
The n-dimensional Laplace operator seems to be too difficult for current LLMs to implement in PyTorch, indicating that the API is not well-suited to the task. However, ChatGPT is able to generate versions for a fixed number of dimensions.
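For comparison, a working n-dimensional version can be written by hand (a sketch we add here, not ChatGPT output). It replicates edge values to mimic the 'zero-gradient' padding used in the Φ-ML version:

def laplace_nd(grid):
    grid = grid.float()  # avoid integer/float dtype mismatches
    result = torch.zeros_like(grid)
    for dim in range(grid.dim()):
        n = grid.size(dim)
        # Shift with edge replication along this dimension.
        left = torch.cat([grid.narrow(dim, 0, 1), grid.narrow(dim, 0, n - 1)], dim)
        right = torch.cat([grid.narrow(dim, 1, n - 1), grid.narrow(dim, n - 1, 1)], dim)
        result += left + right - 2 * grid
    return result

print(laplace_nd(torch.tensor([0, 1, 0])))  # tensor([ 1., -2.,  1.])

Even so, the loop over dimensions and the manual padding logic must be maintained by hand, whereas the Φ-ML version above is three lines.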
ChatGPT's 1D version below does work for inputs of type float, but an additional cast is required to make it work with our integer example.
These data type problems are always resolved under the hood in Φ-ML. Our version even accepts bool and complex inputs, neither of which works with PyTorch out of the box.
# ChatGPT solution for 1D laplace
def laplace_operator_1d(grid):
    laplace_stencil = torch.Tensor([1, -2, 1]).view(1, 1, -1)
    laplace_result = F.conv1d(grid.view(1, 1, -1), laplace_stencil, padding=1)
    return laplace_result.view(-1)
# Example usage:
grid_1d = torch.tensor([0, 1, 0])
try:
    result_1d = laplace_operator_1d(grid_1d)
except RuntimeError as err:
    print(err)
expected scalar type Long but found Float
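To back up the claim above, here is a quick check (an example we add) that the Φ-ML laplace from earlier also runs on bool and complex inputs:

print(laplace(math.tensor([False, True, False], spatial('x'))))  # bool input
print(laplace(math.tensor([0, 1 + 1j, 0], spatial('x'))))  # complex input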
Dimension names and types are organized in the shapes of tensors.
Also see the introduction to tensors.