In this notebook, we will train a fully-connected neural network to solve an inverse ballistics problem. We will compare supervised training to differentiable physics training, both numerically and visually.

In [1]:

```
# !pip install phiflow
```

In [2]:

```
# from phi.tf.flow import *
from phi.torch.flow import *
# from phi.jax.stax.flow import *
```

In [3]:

```
def simulate_hit(pos, height, vel, angle, gravity=1.):
    """Compute where a projectile launched from (pos, height) hits the ground."""
    vel_x, vel_y = math.cos(angle) * vel, math.sin(angle) * vel
    height = math.maximum(height, .5)  # clamp to a minimum launch height
    hit_time = (vel_y + math.sqrt(vel_y**2 + 2 * gravity * height)) / gravity  # closed-form flight time
    return pos + vel_x * hit_time, hit_time, height, vel_x, vel_y

simulate_hit(10, 1, 1, 0)[0]
```

Out[3]:

`11.414213`
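The flight time used in `simulate_hit` follows from the parabolic height profile. The object's height over time is

$$y(t) = h + v_y\, t - \frac{g}{2} t^2,$$

and setting $y(t) = 0$, the positive root of this quadratic gives

$$t_\text{hit} = \frac{v_y + \sqrt{v_y^2 + 2\, g\, h}}{g}.$$

For the call above ($h = 1$, $v = 1$, angle $0$): $v_x = 1$, $v_y = 0$, so $t_\text{hit} = \sqrt{2}$ and the hit position is $10 + \sqrt{2} \approx 11.414$, matching the output.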

Next, we compute the full trajectory *y(x)* and sample it on a grid. We have to drop invalid values, such as negative flight times and below-ground points.

In [4]:

```
def sample_trajectory(pos, height, vel, angle, gravity=1.):
    hit, hit_time, height, vel_x, vel_y = simulate_hit(pos, height, vel, angle, gravity)
    t = math.linspace(0, hit_time, spatial(t=100))
    return vec(x=pos + vel_x * t, y=height + vel_y * t - gravity / 2 * t ** 2)

vis.plot(sample_trajectory(tensor(10), 1, 1, math.linspace(-PI/4, 1.5, channel(angle=7))), title="Varying Angle")
```


Out[4]:

<Figure size 1200x500 with 1 Axes>

Next, let's find an initial velocity `vel` that makes the object land at a desired position. We need to define a loss function to optimize. Here we desire the object to hit at `x=0`.

In [5]:

```
vel = tensor(1)

def loss_function(vel):
    return math.l2_loss(simulate_hit(10, 1, vel, 0)[0] - 0)

loss_function(0)
```

Out[5]:

`50.0`

ΦFlow uses the selected library (PyTorch/TensorFlow/Jax) to derive analytic derivatives.
By default, the gradient function also returns the function value.

In [6]:

```
gradient = math.functional_gradient(loss_function, get_output=True)
gradient(vel)
```

Out[6]:

(65.14213, [16.142136])
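As a sanity check, these numbers can be reproduced by hand (a pure-Python sketch, independent of ΦFlow): with `angle=0` the hit position is `10 + vel*sqrt(2)`, so the L2 loss is ½(10 + vel·√2)² and its derivative with respect to `vel` follows from the chain rule.

```python
import math

vel = 1.0
hit = 10 + vel * math.sqrt(2)   # hit position for angle=0, height=1, gravity=1
loss = 0.5 * (hit - 0) ** 2     # same convention as math.l2_loss
grad = math.sqrt(2) * hit       # d(loss)/d(vel) by the chain rule

print(loss, grad)  # ≈ 65.14213, 16.142136 — matching the output above
```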

In [7]:

```
trj = [vel]
for i in range(10):
    loss, (grad,) = gradient(vel)
    vel = vel - .2 * grad  # gradient descent step with learning rate 0.2
    trj.append(vel)
    print(f"vel={vel} - loss={loss}")
trj = math.stack(trj, channel('opt'))
vis.plot(sample_trajectory(tensor(10), 1, trj, 0))
```

```
vel=-2.2284272 - loss=65.14213
vel=-4.1654835 - loss=23.451168
vel=-5.3277173 - loss=8.442421
vel=-6.0250573 - loss=3.0392709
vel=-6.4434614 - loss=1.0941381
vel=-6.694504 - loss=0.39388976
vel=-6.8451295 - loss=0.14180061
vel=-6.935505 - loss=0.051048037
vel=-6.9897304 - loss=0.018377367
vel=-7.0222654 - loss=0.0066157645
```

Out[7]:

<Figure size 1200x500 with 1 Axes>

As seen above, simply subtracting the gradient times a learning rate $\eta = 0.2$ converges toward the optimal velocity.
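The optimization loop can be reproduced in plain Python (a sketch for the `angle=0` case, using the analytic gradient instead of autodiff):

```python
import math

SQRT2 = math.sqrt(2)
vel = 1.0
for i in range(10):
    hit = 10 + vel * SQRT2    # simulated hit position
    loss = 0.5 * hit ** 2     # L2 loss toward target x=0
    grad = SQRT2 * hit        # analytic gradient of the loss
    vel = vel - 0.2 * grad    # learning rate eta = 0.2

print(vel)  # ≈ -7.0223, approaching the optimum -10/sqrt(2) ≈ -7.0711
```

The iterates match the printed `vel` values above exactly, since the update is the linear recursion `vel ← 0.6·vel − 2√2`.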

Next, we generate a training set and test by sampling random values.

In [8]:

```
def generate_data(shape):
    pos = math.random_normal(shape)
    height = math.random_uniform(shape) + .5
    vel = math.random_uniform(shape)
    angle = math.random_uniform(shape) * PI/2
    return math.stack(dict(pos=pos, height=height, vel=vel, angle=angle), channel('vector'))

x_train = generate_data(batch(examples=1000))
x_test = generate_data(batch(examples=1000))
y_train = simulate_hit(*x_train.vector)[0]
y_test = simulate_hit(*x_test.vector)[0]
```
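Note the parameter ranges: heights are sampled at or above 0.5 (matching the clamp in `simulate_hit`) and angles stay in [0, π/2]. A plain-Python version of the same sampling (a sketch using the standard `random` module instead of ΦFlow tensors):

```python
import math
import random

def generate_data(n):
    data = []
    for _ in range(n):
        pos = random.gauss(0, 1)               # standard normal, like math.random_normal
        height = random.random() + 0.5         # uniform in [0.5, 1.5)
        vel = random.random()                  # uniform in [0, 1)
        angle = random.random() * math.pi / 2  # uniform in [0, pi/2)
        data.append((pos, height, vel, angle))
    return data

train = generate_data(1000)
```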

Next, we create a fully-connected network for supervised training. We set the random `seed` to make the weight initialization predictable.

In [9]:

```
math.seed(0)
net_sup = dense_net(1, 4, [32, 64, 32])
net_sup
```

Out[9]:

```
DenseNet(
  (linear0): Linear(in_features=1, out_features=32, bias=True)
  (linear1): Linear(in_features=32, out_features=64, bias=True)
  (linear2): Linear(in_features=64, out_features=32, bias=True)
  (linear_out): Linear(in_features=32, out_features=4, bias=True)
)
```

In [10]:

```
# net_sup.trainable_weights[0] # TensorFlow
net_sup.state_dict()['linear0.weight'].flatten() # PyTorch
# net_sup.parameters[0][0] # Stax
```

Out[10]:

tensor([ 0.5317, 0.5576, 0.8676, -0.0228, 0.9547, 0.9196, 0.9320, 0.9324, -0.4410, 0.0031, -0.6949, 0.7739, 0.1919, 0.6169, -0.9259, 0.4986, -0.0187, -0.9493, -0.4455, 0.8291, 0.7643, 0.1477, 0.4587, -0.4650, 0.9987, -0.2630, -0.9132, 0.5971, -0.5870, -0.5184, -0.2275, -0.2534])

For the differentiable physics network we do the same thing again.

In [11]:

```
math.seed(0)
net_dp = dense_net(1, 4, [32, 64, 32])
# net_dp.trainable_weights[0] # TensorFlow
net_dp.state_dict()['linear0.weight'].flatten() # PyTorch
# net_dp.parameters[0][0] # Stax
```

Out[11]:

tensor([ 0.1588, -0.6739, 0.9272, 0.1877, 0.5815, 0.7158, -0.1305, -0.8672, 0.8932, -0.3480, 0.5963, -0.4821, 0.4355, 0.7216, 0.6033, 0.6483, -0.2183, 0.9263, 0.3853, 0.4165, 0.5594, 0.5004, 0.6511, 0.7634, -0.6957, -0.0052, 0.3889, 0.8045, -0.4385, 0.3754, -0.9979, -0.9813])

Now we can train the network. We feed the desired hit position into the network and predict a possible initial state. For supervised training, we compare the network prediction to the ground truth from our training set.

In [12]:

```
opt_sup = adam(net_sup)

def supervised_loss(x, y, net=net_sup):
    prediction = math.native_call(net, y)
    return math.l2_loss(prediction - x)

print(f"Supervised loss (training set): {supervised_loss(x_train, y_train)}")
print(f"Supervised loss (test set): {supervised_loss(x_test, y_test)}")
```

In [13]:

```
for i in range(100):
    update_weights(net_sup, opt_sup, supervised_loss, x_train, y_train)
print(f"Supervised loss (training set): {supervised_loss(x_train, y_train)}")
print(f"Supervised loss (test set): {supervised_loss(x_test, y_test)}")
```

For the differentiable physics loss, we simulate the trajectory given the initial conditions predicted by the network. Then we can measure how close to the desired location the network got.
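The key point is that gradients flow through the simulator itself. A pure-Python sketch of this: perturbing the predicted velocity changes the simulated hit point, and a finite-difference quotient recovers the same gradient that autodiff would compute analytically (the launch parameters and target `y = 3` here are made-up illustration values):

```python
import math

def hit_x(pos, height, vel, angle, gravity=1.0):
    # Same closed-form simulation as simulate_hit, in plain Python
    vel_x, vel_y = math.cos(angle) * vel, math.sin(angle) * vel
    height = max(height, 0.5)
    t = (vel_y + math.sqrt(vel_y**2 + 2 * gravity * height)) / gravity
    return pos + vel_x * t

def physics_loss(vel, y_target):
    # Loss on the *simulated* hit point -- no ground-truth initial state needed
    return 0.5 * (hit_x(0.0, 1.0, vel, 0.0) - y_target) ** 2

eps = 1e-6
v, y = 1.0, 3.0
fd_grad = (physics_loss(v + eps, y) - physics_loss(v - eps, y)) / (2 * eps)
analytic = math.sqrt(2) * (v * math.sqrt(2) - y)  # chain rule through the simulator
print(fd_grad, analytic)
```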

In [14]:

```
def physics_loss(y, net=net_dp):
    prediction = math.native_call(net, y)
    y_sim = simulate_hit(*prediction.vector)[0]  # simulate the predicted initial state
    return math.l2_loss(y_sim - y)

opt_dp = adam(net_dp)
for i in range(100):
    update_weights(net_dp, opt_dp, physics_loss, y_train)
print(f"Supervised loss (training set): {supervised_loss(x_train, y_train, net=net_dp)}")
print(f"Supervised loss (test set): {supervised_loss(x_test, y_test, net=net_dp)}")
```

In [15]:

```
print(f"Supervised network (test set): {physics_loss(y_test, net=net_sup)}")
print(f"Diff.Phys. network (test set): {physics_loss(y_test, net=net_dp)}")
```

Now this is much more promising. The diff.phys. network hits the desired location very accurately, considering it was only trained for 100 iterations. With more training steps, this loss will decrease even further, unlike the supervised network's.

So what is going on here? Why does the supervised network perform so poorly? The answer lies in the problem itself. The task is multi-modal, i.e. there are many initial states that will hit the same target. The network only gets the target position and must decide on a single initial state. With supervised training, there is no way to know which ground truth solution occurs in the test set. The best the network can do is average nearby solutions from the training set. But since the problem is non-linear, this will give only a rough guess.
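The multi-modality is easy to demonstrate: for any hit position there is a whole family of initial states that reach it. A pure-Python sketch (bisecting over the launch speed at a fixed 45° angle; the bracket `[0, 2]` is an assumption that works because the hit distance grows monotonically with speed):

```python
import math

def hit_x(height, vel, angle, gravity=1.0):
    # Landing x for a launch from (0, height), as in simulate_hit
    vel_x, vel_y = math.cos(angle) * vel, math.sin(angle) * vel
    t = (vel_y + math.sqrt(vel_y**2 + 2 * gravity * height)) / gravity
    return vel_x * t

target = hit_x(1.0, 1.0, 0.0)  # flat throw at speed 1

lo, hi = 0.0, 2.0              # bisection bracket for the 45-degree launch speed
for _ in range(60):
    mid = (lo + hi) / 2
    if hit_x(1.0, mid, math.pi / 4) < target:
        lo = mid
    else:
        hi = mid

# A different initial state (slower, but launched at 45 degrees) hits the same spot
print(target, lo, hit_x(1.0, lo, math.pi / 4))
```

A supervised network averaging over such distinct ground-truth states produces an initial state that hits neither target.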

The diff.phys. network completely ignores the ground-truth solutions, which are not even passed to the `physics_loss` function. Instead, it learns to hit the desired spot, which is exactly what we want.

We can visualize the difference by looking at a couple of trajectories.

In [16]:

```
predictions = math.stack({
    "Ground Truth": x_test.examples[:4],
    "Supervised": math.native_call(net_sup, y_test.examples[:4]),
    "Diff.Phys": math.native_call(net_dp, y_test.examples[:4]),
}, channel('curves'))
vis.plot(sample_trajectory(*predictions.vector), size=(16, 4))
```


Out[16]:

<Figure size 1600x400 with 4 Axes>