• 🌐 ΦML • 📖 Documentation • 🔗 API • ▶ Videos • Examples
This notebook introduces matrices in ΦML. Also check out the introduction to linear solves.
%%capture
!pip install phiml
Matrices are typically represented as a 2D array with N rows and M columns. They can be multiplied with an M-vector, producing an N-vector, i.e. the horizontal axis of the matrix is reduced in the operation.
ΦML generally deals with higher-dimensional tensors.
Say we want to represent a matrix that transforms one image into another, then both vectors would have shape (height, width, channels)
, which is too much for a matrix.
This is typically resolved by packing these dimensions into one using reshaping, adding boilerplate and making code less readable.
ΦML allows you to keep all dimensions.
This is possible because of dimension types.
ΦML provides the dual
dimension type to mark dimensions that will be reduced during a matrix multiplication.
from phiml import math
from phiml.math import wrap, channel, dual
matrix = wrap([[0, 1], [-1, 0]], channel('rows'), dual('cols'))
math.print(matrix)
rows=0 0 1 along ~cols rows=1 -1 0 along ~cols
Here, the cols
dimension is marked as dual
and will be reduced against vectors in a matrix multiplication.
Note that the names of dual dimensions begin with ~
to differentiate them from primal (non-dual) dimensions and avoid name conflicts.
A matrix mapping images to images would have shape (~height, ~width, ~channels, height, width, channels)
and the first three dimensions would be reduced during multiplication.
Let's perform a matrix multiplication with matrix
!
Dual dimensions are matched against vector dimensions of the same name.
vector = wrap([2, 3], channel('cols'))
matrix @ vector
(3, -2) along rowsᶜ int64
For matrices with only one dual dimension, other vector dimension names are also allowed.
vector = wrap([2, 3], channel('vector_dim'))
matrix @ vector
--------------------------------------------------------------------------- AssertionError Traceback (most recent call last) Cell In[4], line 2 1 vector = wrap([2, 3], channel('vector_dim')) ----> 2 matrix @ vector File /opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/phiml/math/_tensors.py:803, in Tensor.__matmul__(self, other) 801 if not match_primal: 802 assert non_batch(other).non_dual.rank == 1, f"Cannot multiply {self.shape} @ {other.shape} because arg2 does not have appropriate non-dual dimensions" --> 803 assert non_batch(other).non_dual.size == match_primal.volume, f"Cannot multiply {self.shape} @ {other.shape} because dual dims of arg1 have no match" 804 match_primal = non_batch(other).non_dual 805 match_dual = self.shape.dual.only(match_primal.as_dual(), reorder=True) AssertionError: Cannot multiply (~colsᵈ=2, rowsᶜ=2) @ (vector_dimᶜ=2) because dual dims of arg1 have no match
Matrix multiplications are a special case of a dot product.
You can perform general multiply-reduce using math.dot()
or using the dimension referencing syntax:
vec1 = wrap([[0, 1], [-1, 0]], channel('c1,c2'))
vec2 = wrap([2, 3], channel('c3'))
vec1.c2 @ vec2.c3
(3, -2) along c1ᶜ int64
Here, we performed matrix multiplication by explicitly specifying the dimensions to be reduced.
ΦML can convert linear Python functions into (sparse) matrices.
To do this, pass an example input vector (matching the dual / column dimensions of the matrix) to math.matrix_from_function()
.
from phiml.math import matrix_from_function, zeros
def mul_by_2(x):
return 2 * x
example_input = zeros(channel(vector=3))
matrix, bias = matrix_from_function(mul_by_2, example_input)
matrix
2.0
In this simple case, the matrix was identified to be a scalar. Let's multiply all but the nth element by 0.
def mask_first(x, n: int):
one_hot = math.range(x.shape) == n
return x * one_hot
matrix, bias = matrix_from_function(mask_first, example_input, n=2)
math.print(matrix)
vector=0 0. 0. 0. along ~vector vector=1 0. 0. 0. along ~vector vector=2 0. 0. 1. along ~vector
Now we got a proper 3x3 matrix.
Since our example input had a dimension named vector
, the columns of the resulting matrix are called ~vector
.
In this case, the returned matrix only contains a single non-zero entry.
matrix
sparse coo (~vectorᵈ=3, vectorᶜ=3) with 1 entries: (entriesⁱ=1) const 1.0
Next, we create a banded matrix.
def finite_difference_gradient(x):
return x[1:] - x[:-1]
matrix, bias = matrix_from_function(finite_difference_gradient, example_input)
math.print(matrix)
vector=0 -1. 1. 0. along ~vector vector=1 0. -1. 1. along ~vector
This gives us a 2x3 matrix since the output is one shorter than the input. We can fix this by padding the input first.
def finite_difference_gradient(x, padding):
x = math.pad(x, (0, 1), padding)
return x[1:] - x[:-1]
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=0)
math.print(matrix)
vector=0 -1. 1. 0. along ~vector vector=1 0. -1. 1. along ~vector vector=2 0. 0. -1. along ~vector
Depending on what padding we use, we get different matrices.
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=math.extrapolation.PERIODIC)
math.print(matrix)
vector=0 -1. 1. 0. along ~vector vector=1 0. -1. 1. along ~vector vector=2 1. 0. -1. along ~vector
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=math.extrapolation.ZERO_GRADIENT)
math.print(matrix)
vector=0 -1. 1. 0. along ~vector vector=1 0. -1. 1. along ~vector vector=2 0. 0. 0. along ~vector
So far, the bias has always been zero because our functions did not add any constants to the output. With a constant non-zero padding, this changes.
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=1)
math.print(matrix)
bias
vector=0 -1. 1. 0. along ~vector vector=1 0. -1. 1. along ~vector vector=2 0. 0. -1. along ~vector
(0.000, 0.000, 1.000)
Here, we get the same matrix as with padding=0
, but the bias has picked up a non-zero term for the last entry.
Note that the constructed matrix will prefer NumPy, even when a different backend is selected. When running a linear solve within a JIT-compiled function, this allows for matrix and preconditioner construction at compile time, not at runtime.
math.use('torch')
matrix, bias = matrix_from_function(finite_difference_gradient, example_input, padding=1)
matrix.default_backend
numpy
Matrices can be converted to the selected backend using convert
.
matrix = math.convert(matrix)
matrix.default_backend
torch
Dense matrices are simply tensors that have one or multiple dual dimensions. Creating a sparse matrix is therefore as simple as setting the correct dimension type on a tensor.
Sparse matrices can be created using math.sparse_tensor()
.
All non-zero values need to be specified together with their indices.
from phiml.math import instance, expand
indices = wrap([(0, 0), (1, 1)], instance('indices'), channel(vector='rows,~cols'))
values = expand(wrap(1.), instance(indices))
dense_shape = channel(rows=3) & dual(cols=3)
matrix = math.sparse_tensor(indices, values, dense_shape, format='coo', can_contain_double_entries=False)
math.print(matrix)
rows=0 1. 0. 0. along ~cols rows=1 0. 1. 0. along ~cols rows=2 0. 0. 0. along ~cols
The indices
specify the index for each sparse dimension (rows
and ~cols
in this case) along a channel dimension named vector
.
The item names must match the sparse dimension names but the order is irrelevant.
The dimension enumerating the different non-zero values must be an instance dimension and must also be present on values
.
The format
option allows the creation of different kinds of sparse matrices. coo
stands for coordinate format, which basically keeps the indices
and `values´ tensors as-is.
matrix
float64 sparse coo (~colsᵈ=3, rowsᶜ=3) with 2 entries: (indicesⁱ=2) float64 const 1.0
Other formats include compressed sparse row, csr
, which compresses the row (primal) indices, and csc
, which compresses the column (dual) indices.
math.sparse_tensor(indices, values, dense_shape, format='csr')
float64 sparse csr (~colsᵈ=3, rowsᶜ=3) with 2 entries: (sp_entriesⁱ=2) float64 const 1.0
math.sparse_tensor(indices, values, dense_shape, format='csc')
float64 sparse csc (~colsᵈ=3, rowsᶜ=3) with 2 entries: (sp_entriesⁱ=2) float64 const 1.0
If you need to perform many matrix multiplications with a single matrix, csr
is usually a good choice.
If you only use the matrix a handful of times, use coo
.
The csc
format is good for transposed matrix multiplications as well as slicing and concatenating the along column (dual) dimensions.
Matrices can be sliced and concatenated like regular tensors. Note the .dual
instead of ~
for dual dimensions.
matrix.rows[0]
(1.000, 0.000, 0.000) along ~colsᵈ
matrix.cols.dual[0]
(1.000, 0.000, 0.000) along rowsᶜ
matrix.rows[1:]
(0.000, 0.000); (1.000, 0.000); (0.000, 0.000) (~colsᵈ=3, rowsᶜ=2)
row0 = matrix.rows[0:1] * 2
other_rows = matrix.rows[1:]
math.concat([row0, other_rows], 'rows')
sparse coo (~colsᵈ=3, rowsᶜ=3) with 2 entries: (indicesⁱ=2) 1.500 ± 0.500 (1e+00...2e+00)
Matrices with dual dimensions can be used in linear solves.
🌐 ΦML • 📖 Documentation • 🔗 API • ▶ Videos • Examples