Advantages of Dimension Names and Types¶


In [1]:
%%capture
!pip install phiml
from phiml import math
from phiml.math import spatial, instance, channel, batch
import torch

The interplay between dimension types and names enables user code to be much more concise and expressive. These advantages are hard to explain in the abstract, so we will instead demonstrate the benefits with simple examples.

Gathering and Scattering¶

Operations like gather and scatter -- taking values out of a tensor or putting data into a tensor -- are among the most basic and important operations.

Task: Compute min(0, value) for some values at given indices of a data tensor and write the updated values back to the tensor. The complexity should be independent of the size of data and the code should be differentiable.

Let's look at the Φ-ML version first. We are given the data, ordered in the usual format y,x, and the indices, ordered as x,y.

In [2]:
data = math.tensor([[1, 2, 3], [-4, -5, -6]], spatial('y,x'))
indices = math.tensor([(0, 0), (2, 1)], instance('indices'), channel(idx='x,y'))

We can compute the result by gathering the values at the indices, computing the minimum, and then writing them back.
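The gather step by itself is plain indexing: data[indices] matches each index against the named dimensions, so it should return the values 1 and -6, one value per index, listed along the instance dim 'indices' (a quick sketch, not executed here).

In [ ]:
data[indices]  # expected: 1 and -6 along the instance dim 'indices'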

In [3]:
math.scatter(data, indices, math.minimum(0, data[indices]))
Out[3]:
0; 2; 3; -4; -5; -6 (yˢ=2, xˢ=3) int64

As expected, the 1 at index (0,0) was replaced by a 0, while the -6 at (2,1) was already lower than 0. Also, the x,y order of the indices was automatically matched to the y,x dimension order of data, since Φ-ML lets us declare the index order directly via channel(idx='x,y').
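To see this matching at work, we could pass the same indices in y,x order instead; declaring channel(idx='y,x') should yield the identical result (a sketch, not executed here).

In [ ]:
indices_yx = math.tensor([(0, 0), (1, 2)], instance('indices'), channel(idx='y,x'))
math.scatter(data, indices_yx, math.minimum(0, data[indices_yx]))  # same result as above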

Actually, the Φ-ML scatter function already has a mode for computing the minimum, so we could instead have written

In [4]:
math.scatter(data, indices, 0, mode=min)
Out[4]:
0; 2; 3; -4; -5; -6 (yˢ=2, xˢ=3) int64

Now let's look at the same operation in PyTorch, without dimension names.

In [5]:
data = torch.tensor([[1, 2, 3], [-4, -5, -6]])  # y,x
indices = torch.tensor([(0, 0), (2, 1)])  # x,y

It turns out that this is quite hard to get right. After our initial attempts at the task failed, we asked two fellow AI researchers who use PyTorch for help; neither could produce working code within 10 minutes. The following is what ChatGPT came up with, given a detailed description of the task:

In [6]:
try:
    # ChatGPT "solution"
    update_indices = indices[:, [1, 0]]
    update_values = torch.min(torch.zeros_like(update_indices, dtype=data.dtype), data[update_indices[:, 0], update_indices[:, 1]])
    data.scatter_add_(0, update_indices, update_values)
except RuntimeError as err:
    print(err)
index 2 is out of bounds for dimension 0 with size 2

Getting this simple exercise right seems to be quite difficult, both for LLMs and long-time PyTorch users. Here, the out-of-bounds error arises because scatter_add_ interprets every entry of the index tensor as an index into dimension 0, so the coordinate 2 exceeds the y-axis size of 2. We will leave a correct implementation as an exercise to the reader. If you think you have a solution, check that the code is differentiable as well!

Now imagine we had a batch dimension on data as well. Let's try this in Φ-ML!

In [7]:
data = math.tensor([[1, 2, 3], [-4, -5, -6]], spatial('y,x'))
indices = math.tensor([(0, 0), (2, 1)], instance('indices'), channel(idx='x,y'))
data *= math.range(batch(b=10))  # this is new!

Our code from above works with this setting as well. To check this, we print batch index 1, which matches the case above.

In [8]:
math.scatter(data, indices, 0, mode=min).b[1]
Out[8]:
0; 2; 3; -4; -5; -6 (yˢ=2, xˢ=3) int64

Making PyTorch code scale with arbitrary batch dimensions is exceedingly difficult. That's why practically all PyTorch code requires inputs with a fixed number of dimensions in a specific order.
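The task also required differentiability. The following is a minimal sketch of how this could be checked; we assume here that math.use and math.gradient behave as in the Φ-ML functional API, with the installed PyTorch providing autodiff, and that float-valued data is used.

In [ ]:
math.use('torch')  # gradients require an autodiff backend (assumption: torch is available)
float_data = math.tensor([[1., 2., 3.], [-4., -5., -6.]], spatial('y,x'))

def loss(d):
    return math.sum(math.scatter(d, indices, 0, mode=min))

grad_fn = math.gradient(loss, wrt='d', get_output=False)  # assumed signature
print(grad_fn(float_data))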

Finding Specific Values¶

Task: For each 1D sequence in a batch, find the index of the first 0 that has two positive neighbors.

Φ-ML solution:

In [9]:
data = math.tensor([[0, 1, 0, 2, 0, 1], [-1, 0, 1, 0, 2, 1]], batch('b'), spatial('seq'))
nb_offset = math.tensor([-1, 1], instance('neighbors'))  # offsets of the left and right neighbor

zero_positions = math.nonzero(data.seq[1:-1] == 0) + 1  # zeros away from the boundary; +1 undoes the slice offset
valid = math.all(data[zero_positions + nb_offset] > 0, 'neighbors')  # are both neighbors positive?
zero_positions[valid].nonzero[0]  # first valid position per batch
Out[9]:
(seq=2); (seq=3) (bᵇ=2, vectorᶜ=seq) int64

PyTorch solution:

In [10]:
data = torch.tensor([[0, 1, 0, 2, 0, 1], [-1, 0, 1, 0, 2, 1]])

result = []
for sequence in data:
    zero_positions = torch.nonzero(sequence == 0, as_tuple=True)[0]
    valid_positions = zero_positions[(zero_positions > 0) & (zero_positions < len(sequence) - 1)]
    neighbors_positive = (sequence[valid_positions - 1] > 0) & (sequence[valid_positions + 1] > 0)
    result_index = valid_positions[neighbors_positive][0]
    result.append(result_index)
torch.stack(result, dim=0)
Out[10]:
tensor([2, 3])

Unlike the PyTorch version, Φ-ML can automatically vectorize over the sequences without the user needing to write a for loop.
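One might hope that torch.func.vmap could vectorize the loop away. To our knowledge it cannot here, because torch.nonzero has a data-dependent output shape, which vmap does not support (a sketch of the expected failure, not executed here).

In [ ]:
from torch.func import vmap

def first_valid_zero(sequence):
    zero_positions = torch.nonzero(sequence == 0, as_tuple=True)[0]  # data-dependent size
    valid_positions = zero_positions[(zero_positions > 0) & (zero_positions < len(sequence) - 1)]
    neighbors_positive = (sequence[valid_positions - 1] > 0) & (sequence[valid_positions + 1] > 0)
    return valid_positions[neighbors_positive][0]

try:
    print(vmap(first_valid_zero)(data))
except RuntimeError as err:
    print(err)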

Laplace Operator¶

Consider the discrete Laplace operator ∇². On a 1D grid, it can be computed with the stencil (1, -2, 1) and in 2D with the stencil (0 1 0 / 1 -4 1 / 0 1 0).

Task: Implement the Laplace operator for n-dimensional grids.

With Φ-ML's typed dimensions, we can use shift, passing in the dimensions along which we want to shift the data. Then we apply the stencil and sum the components.

In [11]:
data_1d = math.tensor([0, 1, 0], spatial('x'))
data_2d = math.tensor([[0, 0, 0], [0, 1, 0], [0, 0, 0]], spatial('y,x'))

def laplace(x, padding='zero-gradient'):
    left, center, right = math.shift(x, (-1, 0, 1), spatial, padding)
    return math.sum((left + right - 2 * center), 'shift')

print(laplace(data_1d))
math.print(laplace(data_2d))
(1, -2, 1) along xˢ int64
  0,  1,  0,
  1, -4,  1,
  0,  1,  0  along (yˢ=3, xˢ=3)

This automatically generalizes to n dimensions since we shift in all spatial dimensions.
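As a quick sanity check of the n-dimensional claim, applying the same function to a 3D unit impulse should produce -6 at the center and 1 at the six face neighbors (a sketch, not executed here).

In [ ]:
data_3d = math.tensor([[[0, 0, 0], [0, 0, 0], [0, 0, 0]],
                       [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
                       [[0, 0, 0], [0, 0, 0], [0, 0, 0]]], spatial('z,y,x'))
laplace(data_3d)  # expected: -6 at the center, 1 at each of the 6 face neighbors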

Doing this directly with PyTorch is much more cumbersome. After multiple rounds of generating code with ChatGPT and feeding the error messages back to it, the model converged on the following output:

In [12]:
import torch
import torch.nn.functional as F

# ChatGPT "solution"
def laplace_operator_nd(grid):
    # Get the number of dimensions
    ndim = grid.dim()

    # Construct the Laplace stencil for n-dimensions
    laplace_stencil = torch.zeros((1,) * (ndim - 1) + (1, 3, 3))
    center_idx = tuple(slice(1, 2) for _ in range(ndim - 1)) + (0, 1, 1)
    laplace_stencil[center_idx] = -2
    for i in range(ndim - 1):
        laplace_stencil = laplace_stencil.narrow(i, 0, 1).clone()

    # Apply the convolution along each dimension
    laplace_result = grid.clone()
    for i in range(ndim):
        laplace_result = F.conv1d(laplace_result.unsqueeze(0), laplace_stencil.to(laplace_result.device), padding=1)
        laplace_result = laplace_result.squeeze(0)

    return laplace_result


data_1d = torch.tensor([1, 2, 3, 4, 5])
try:
    result_1d = laplace_operator_nd(data_1d)
except RuntimeError as err:
    print(err)
Given groups=1, weight of size [1, 3, 3], expected input[1, 1, 5] to have 3 channels, but got 1 channels instead

The n-dimensional Laplace operator seems to be too difficult for current LLMs to implement with PyTorch, indicating that the API is not well-suited to the task. However, ChatGPT is able to generate versions for a fixed number of dimensions.

The code below does work for float inputs, but an additional cast is required to make it work with our integer example. These data type problems are always resolved under the hood in Φ-ML. Our version even accepts bool and complex inputs, neither of which works with PyTorch out of the box (see the sketch after the example).

In [13]:
# ChatGPT solution for 1D laplace
def laplace_operator_1d(grid):
    laplace_stencil = torch.Tensor([1, -2, 1]).view(1, 1, -1)
    laplace_result = F.conv1d(grid.view(1, 1, -1), laplace_stencil, padding=1)
    return laplace_result.view(-1)

# Example usage:
grid_1d = torch.tensor([0, 1, 0])
try:
    result_1d = laplace_operator_1d(grid_1d)
except RuntimeError as err:
    print(err)
expected scalar type Long but found Float
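Casting the input to float resolves the error. For comparison, the Φ-ML laplace defined earlier should also accept bool and complex tensors directly, per the claim above (a sketch, not executed here).

In [ ]:
print(laplace_operator_1d(grid_1d.float()))  # works after an explicit cast
print(laplace(math.tensor([False, True, False], spatial('x'))))  # bool input
print(laplace(math.tensor([0, 1 + 1j, 0], spatial('x'))))  # complex input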

Further Reading¶

Dimension names and types are organized in the shapes of tensors.

Also see the introduction to tensors.
