Matrices in ΦML¶

Colab   •   🌐 ΦML   •   📖 Documentation   •   🔗 API   •   ▶ Videos   •   Examples

This notebook introduces matrices in ΦML. Also check out the introduction to linear solves.

In [1]:
%%capture
!pip install phiml

Primal and Dual Dimensions¶

Matrices are typically represented as a 2D array with N rows and M columns. They can be multiplied with an M-vector, producing an N-vector, i.e. the horizontal axis of the matrix is reduced in the operation.

ΦML generally deals with higher-dimensional tensors. Say we want to represent a matrix that transforms one image into another. Then both vectors would have shape (height, width, channels), which is more dimensions than a classical matrix allows. This is typically resolved by packing these dimensions into one via reshaping, which adds boilerplate and makes code less readable.
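
For illustration, here is a minimal NumPy sketch of that pattern (the image size and operator are made up): the image must be flattened before the multiplication and reshaped back afterwards.

import numpy as np

height, width, channels = 4, 4, 3
image = np.random.rand(height, width, channels)
A = np.eye(height * width * channels)  # some linear operator acting on flattened images
result = (A @ image.reshape(-1)).reshape(height, width, channels)  # flatten, multiply, un-flatten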

ΦML allows you to keep all dimensions. This is possible because of dimension types. ΦML provides the dual dimension type to mark dimensions that will be reduced during a matrix multiplication.

In [2]:
from phiml import math
from phiml.math import wrap, channel, dual

matrix = wrap([[0, 1], [-1, 0]], channel('rows'), dual('cols'))
math.print(matrix)
rows=0      0   1  along ~cols
rows=1     -1   0  along ~cols

Here, the cols dimension is marked as dual and will be reduced against vectors in a matrix multiplication. Note that the names of dual dimensions begin with ~ to differentiate them from primal (non-dual) dimensions and avoid name conflicts. A matrix mapping images to images would have shape (~height, ~width, ~channels, height, width, channels), and the first three dimensions would be reduced during multiplication.
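
As a rough sketch (sizes and dimension types are made up for illustration), such an image-to-image matrix could be allocated like this:

from phiml.math import zeros, dual, spatial, channel

img_matrix = zeros(dual(height=8, width=8, channels=3), spatial(height=8, width=8), channel(channels=3))
img_matrix.shape  # all six dimensions are kept; the three dual ones are reduced in a matmul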

Let's perform a matrix multiplication with matrix! Dual dimensions are matched against vector dimensions of the same name.

In [3]:
vector = wrap([2, 3], channel('cols'))
matrix @ vector
Out[3]:
(3, -2) along rowsᶜ int64

For matrices with only one dual dimension, other vector dimension names may also be accepted. In the run below, however, the name mismatch raises an AssertionError.

In [4]:
vector = wrap([2, 3], channel('vector_dim'))
matrix @ vector
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[4], line 2
      1 vector = wrap([2, 3], channel('vector_dim'))
----> 2 matrix @ vector

File /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/phiml/math/_tensors.py:771, in Tensor.__matmul__(self, other)
    769 if not match_primal:
    770     assert non_batch(other).non_dual.rank == 1, f"Cannot multiply {self.shape} @ {other.shape} because arg2 does not have appropriate non-dual dimensions"
--> 771     assert non_batch(other).non_dual.size == match_primal.volume, f"Cannot multiply {self.shape} @ {other.shape} because dual dims of arg1 have no match"
    772     match_primal = non_batch(other).non_dual
    773 match_dual = self.shape.dual.only(match_primal.as_dual(), reorder=True)

AssertionError: Cannot multiply (~colsᵈ=2, rowsᶜ=2) @ (vector_dimᶜ=2) because dual dims of arg1 have no match
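
If you run into this, one way to make the multiplication work (a minimal sketch, assuming the mismatch is purely a naming issue) is to rename the vector's dimension so that it matches the matrix's dual dimension:

from phiml.math import rename_dims

vector = wrap([2, 3], channel('vector_dim'))
matrix @ rename_dims(vector, 'vector_dim', 'cols')  # now matches ~cols, yielding (3, -2) along rows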

Dot Product between Vectors¶

Matrix multiplications are a special case of a dot product. You can perform general multiply-reduce using math.dot() or using the dimension referencing syntax:

In [5]:
vec1 = wrap([[0, 1], [-1, 0]], channel('c1,c2'))
vec2 = wrap([2, 3], channel('c3'))
vec1.c2 @ vec2.c3
Out[5]:
(3, -2) along c1ᶜ int64

Here, we performed matrix multiplication by explicitly specifying the dimensions to be reduced.
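
The same reduction can also be written with math.dot() by naming the dimension to reduce on each argument; this sketch mirrors the cell above:

math.dot(vec1, 'c2', vec2, 'c3')  # reduces c2 against c3, giving (3, -2) along c1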

Building Matrices from Linear Functions¶

ΦML can convert linear Python functions into (sparse) matrices. To do this, pass an example input vector (matching the dual / column dimensions of the matrix) to math.matrix_from_function().

In [6]:
from phiml.math import matrix_from_function, zeros

def mul_by_2(x):
    return 2 * x

example_input = zeros(channel(vector=3))
matrix, bias = matrix_from_function(mul_by_2, example_input)
matrix
Out[6]:
2.0

In this simple case, the matrix was identified as a scalar. Next, let's zero out all but the n-th element.

In [7]:
def mask_first(x, n: int):
    one_hot = math.range(x.shape) == n
    return x * one_hot

matrix, bias = matrix_from_function(mask_first, example_input, n=2)
math.print(matrix)
vector=0     0.  0.  0.  along ~vector
vector=1     0.  0.  0.  along ~vector
vector=2     0.  0.  1.  along ~vector

Now we obtain a proper 3x3 matrix. Since our example input had a dimension named vector, the columns of the resulting matrix are called ~vector.

In this case, the returned matrix only contains a single non-zero entry.

In [8]:
matrix
Out[8]:
sparse coo (~vectorᵈ=3, vectorᶜ=3) with 1 entries: (entriesⁱ=1) const 1.0

Next, we create a banded matrix.

In [9]:
def finite_difference_gradient(x):
    return x[1:] - x[:-1]

matrix, bias = matrix_from_function(finite_difference_gradient, example_input)
math.print(matrix)
vector=0     -1.   1.   0.  along ~vector
vector=1      0.  -1.   1.  along ~vector

This gives us a 2x3 matrix since the output is one shorter than the input. We can fix this by padding the input first.

In [10]:
def finite_difference_gradient(x, padding):
    x = math.pad(x, (0, 1), padding)
    return x[1:] - x[:-1]

matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=0)
math.print(matrix)
vector=0     -1.   1.   0.  along ~vector
vector=1      0.  -1.   1.  along ~vector
vector=2      0.   0.  -1.  along ~vector

Depending on what padding we use, we get different matrices.

In [11]:
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=math.extrapolation.PERIODIC)
math.print(matrix)
vector=0     -1.   1.   0.  along ~vector
vector=1      0.  -1.   1.  along ~vector
vector=2      1.   0.  -1.  along ~vector
In [12]:
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=math.extrapolation.ZERO_GRADIENT)
math.print(matrix)
vector=0     -1.   1.   0.  along ~vector
vector=1      0.  -1.   1.  along ~vector
vector=2      0.   0.   0.  along ~vector

So far, the bias has always been zero because our functions did not add any constants to the output. With a constant non-zero padding, this changes.

In [13]:
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=1)
math.print(matrix)
bias
vector=0     -1.   1.   0.  along ~vector
vector=1      0.  -1.   1.  along ~vector
vector=2      0.   0.  -1.  along ~vector
Out[13]:
(0.000, 0.000, 1.000)

Here, we get the same matrix as with padding=0, but the bias has picked up a non-zero term for the last entry.
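
As a quick check (with a made-up input vector), matrix @ x + bias should reproduce the original function:

x = wrap([2., 5., 9.], channel(vector=3))
math.print(matrix @ x + bias)                         # (3, 4, -8) along vector
math.print(finite_difference_gradient(x, padding=1))  # same result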

Note that the constructed matrix is backed by NumPy, even when a different backend is selected. When running a linear solve within a JIT-compiled function, this allows the matrix and preconditioner to be constructed at compile time rather than at runtime.

In [14]:
math.use('torch')
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=1)
matrix.default_backend
Out[14]:
numpy

Matrices can be converted to the selected backend using math.convert().

In [15]:
matrix = math.convert(matrix)
matrix.default_backend
Out[15]:
torch

Dense and Sparse Matrices¶

Dense matrices are simply tensors that have one or more dual dimensions. Creating a dense matrix is therefore as simple as setting the correct dimension type on a tensor.

Sparse matrices can be created using math.sparse_tensor(). All non-zero values need to be specified together with their indices.

In [16]:
from phiml.math import instance, expand

indices = wrap([(0, 0), (1, 1)], instance('indices'), channel(vector='rows,~cols'))
values = expand(wrap(1.), instance(indices))
dense_shape = channel(rows=3) & dual(cols=3)
matrix = math.sparse_tensor(indices, values, dense_shape, format='coo', can_contain_double_entries=False)
math.print(matrix)
rows=0     1.  0.  0.  along ~cols
rows=1     0.  1.  0.  along ~cols
rows=2     0.  0.  0.  along ~cols

The indices specify the index for each sparse dimension (rows and ~cols in this case) along a channel dimension named vector. The item names must match the sparse dimension names but the order is irrelevant. The dimension enumerating the different non-zero values must be an instance dimension and must also be present on values.
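
As a small sketch of this name matching (with a hypothetical extra entry), the index components can be listed in either order as long as the item names match the sparse dimensions:

idx = wrap([(2, 0)], instance('indices'), channel(vector='~cols,rows'))  # ~cols listed first
vals = expand(wrap(1.), instance(idx))
math.print(math.sparse_tensor(idx, vals, dense_shape, format='coo'))  # single entry at rows=0, ~cols=2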

The format option allows the creation of different kinds of sparse matrices. coo stands for coordinate format, which keeps the indices and values tensors essentially as-is.

In [17]:
matrix
Out[17]:
float64 sparse coo (~colsᵈ=3, rowsᶜ=3) with 2 entries: (indicesⁱ=2) float64 const 1.0

Other formats include compressed sparse row, csr, which compresses the row (primal) indices, and csc, which compresses the column (dual) indices.

In [18]:
math.sparse_tensor(indices, values, dense_shape, format='csr')
Out[18]:
float64 sparse csr (~colsᵈ=3, rowsᶜ=3) with 2 entries: (sp_entriesⁱ=2) float64 const 1.0
In [19]:
math.sparse_tensor(indices, values, dense_shape, format='csc')
Out[19]:
float64 sparse csc (~colsᵈ=3, rowsᶜ=3) with 2 entries: (sp_entriesⁱ=2) float64 const 1.0

If you need to perform many matrix multiplications with a single matrix, csr is usually a good choice. If you only use the matrix a handful of times, use coo. The csc format is good for transposed matrix multiplications as well as slicing and concatenating along the column (dual) dimensions.
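
For example (a small sketch reusing the matrix from above), repeated matrix-vector products would typically use the csr format:

csr = math.sparse_tensor(indices, values, dense_shape, format='csr')
vec = wrap([1., 2., 3.], channel('cols'))
csr @ vec  # reduces ~cols, giving (1, 2, 0) along rows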

Matrices can be sliced and concatenated like regular tensors. Note that dual dimensions are accessed with .dual rather than the ~ prefix, e.g. matrix.cols.dual[0].

In [20]:
matrix.rows[0]
Out[20]:
(1.000, 0.000, 0.000) along ~colsᵈ
In [21]:
matrix.cols.dual[0]
Out[21]:
(1.000, 0.000, 0.000) along rowsᶜ
In [22]:
matrix.rows[1:]
Out[22]:
(0.000, 0.000); (1.000, 0.000); (0.000, 0.000) (~colsᵈ=3, rowsᶜ=2)
In [23]:
row0 = matrix.rows[0:1] * 2
other_rows = matrix.rows[1:]
math.concat([row0, other_rows], 'rows')
Out[23]:
sparse coo (~colsᵈ=3, rowsᶜ=3) with 2 entries: (indicesⁱ=2) 1.500 ± 0.500 (1e+00...2e+00)

Further Reading¶

Matrices with dual dimensions can be used in linear solves.

🌐 ΦML   •   📖 Documentation   •   🔗 API   •   ▶ Videos   •   Examples