Multi-Task Dynamical Systems

This site provides some companion animations for the paper "Multi-Task Dynamical Systems" by Bird, Williams and Hawthorne. See below for the graphical model.


This page covers two sets of experiments: the motion capture (mocap) work detailed in the paper, and some additional unpublished outputs from applying the model to double pendulum video data.

Motion Capture (Mocap) Animations

In-sample predictions

For the first video, we provide some examples of in-sample performance so that the viewer can become familiar with the dataset and the models. The trajectory over the next second (which comprises most of the input information) is shown along the ground. We demonstrate that both our MTDS model and the competitor model (derived from Martinez et al., 2017) achieve close to optimal performance on the training set.

Experiment 1: limited data

In this video we show the performance of both the MTDS model and the competitor model when trained on small amounts of data. We show three styles for models trained on 6.7% and 13.3% of the training data. A comparison of the aggregate MSE is shown for each model.

Experiment 2: novel test styles

In this video we show the performance of the models when presented with a style at test time that has not previously been observed. Neither model is able to replicate the performance of the in-sample predictions, but notice that the MTDS model captures the style of the lower half fairly well, and crucially avoids any degradation over longer-term prediction. A comparison of the aggregate MSE is shown for each model, along with some notes in the competitor (middle) panel explaining the major differences between the predictions.

Experiment 3: style morphing

Here we show the effect of moving around the latent z space by linearly interpolating between the latent variables associated with each style. In these videos, the middle panel shows the result of this interpolation, with the panels on either side showing the source and target styles respectively. The interpolation is quite subtle (as one would hope from smooth style interpolation), so careful viewing is recommended.
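
Concretely, the morph is just a convex combination in z space. The sketch below illustrates the idea; the latent values and dimensionality are made up for illustration and are not taken from the trained model.

    import numpy as np

    # Hypothetical learned style latents (values are illustrative only).
    z_src = np.array([1.0, -0.5])        # source style
    z_tgt = np.array([-0.8, 0.7])        # target style

    alphas = np.linspace(0.0, 1.0, 9)    # interpolation schedule
    z_path = [(1 - a) * z_src + a * z_tgt for a in alphas]

    # Each z in z_path would condition the MTDS decoder in turn,
    # yielding the smoothly morphing predictions seen in the videos.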

The following video shows the same morphing sequence, but continuously, without title frames or comparison panels.

Double Pendulum Animations

This section explores the capability of our MTDS formulation to predict a variety of future trajectories. Unlike standard stochastic RNN approaches, we do not sample a new latent variable at each time step, which facilitates long-term correlations in the predictive rollouts. This section is no longer part of the paper, as we found it to be similar to some existing work, most notably Babaeizadeh et al. (2018). While our approach is more general (modulating all the parameters rather than simply the bias), for this application we observed similar results for both approaches. We leave the results here for general interest.
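
To make the contrast concrete, here is a minimal sketch of the parameter-modulation idea: a single latent z, drawn once per sequence, generates the full parameter vector of a small RNN cell via an affine map. All names and sizes are illustrative; this is not the code used in our experiments.

    import torch

    d_z, d_x, d_h = 2, 4, 8
    n_theta = d_h * (d_h + d_x + 1)       # W_h, W_x and b, flattened

    torch.manual_seed(0)
    H = 0.1 * torch.randn(n_theta, d_z)   # hypothetical hypernetwork map
    theta_0 = 0.1 * torch.randn(n_theta)  # base (task-independent) parameters

    def unpack(theta):
        W_h = theta[:d_h * d_h].view(d_h, d_h)
        W_x = theta[d_h * d_h:d_h * (d_h + d_x)].view(d_h, d_x)
        b = theta[d_h * (d_h + d_x):]
        return W_h, W_x, b

    def rollout(z, xs):
        # z is drawn ONCE per sequence: every step shares the same
        # modulated parameters, unlike stochastic RNNs which sample
        # a fresh latent at each time step.
        W_h, W_x, b = unpack(theta_0 + H @ z)
        h, hs = torch.zeros(d_h), []
        for x in xs:
            h = torch.tanh(W_h @ h + W_x @ x + b)
            hs.append(h)
        return torch.stack(hs)

    z = torch.randn(d_z)                  # one draw per sequence
    xs = torch.randn(20, d_x)             # illustrative inputs
    print(rollout(z, xs).shape)           # torch.Size([20, 8])

In this picture, modulating only the bias (as in Babaeizadeh et al., 2018) corresponds to restricting H so that only the entries of b depend on z.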

MT-GRU vs. ground truth

Comparing the predicted sequence from our multi-task model with the ground truth (GT) on four examples from the test set. Differences are shown in the 'Delta' box: dark pixels mark the position of the GT bobs, and light pixels mark the position of the MT-GRU bobs. The instantaneous negative log likelihood (NLL) is also shown. Observe that a high loss may arise for a variety of reasons.
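
For concreteness, a panel of this kind might be computed as follows, assuming the bobs are rendered as bright pixels on a dark background and a per-pixel Bernoulli likelihood; both are our assumptions, not details taken from the paper.

    import numpy as np

    def delta_and_nll(gt, pred, eps=1e-6):
        # gt, pred: pixel intensities in [0, 1], shape (H, W).
        # Mid-grey where the frames agree; dark where only the GT
        # bobs are, light where only the MT-GRU bobs are.
        delta = 0.5 + 0.5 * (pred - gt)
        # Per-pixel Bernoulli NLL of the GT under the prediction
        # (an assumed form of the instantaneous loss).
        p = np.clip(pred, eps, 1 - eps)
        nll = -np.sum(gt * np.log(p) + (1 - gt) * np.log1p(-p))
        return delta, nll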

Some configurations with the green bob above the red bob become highly sensitive to the relative positions and velocities of the bobs. This is where chaotic dynamics are especially prevalent -- you can see the predictions 'breaking down' in the final example as the cost of representing this uncertainty becomes too high.

Adjusting predictions via the latent z

Here we seed the MT-GRU with the same initial sequence (same x_0) and choose three different values of z after the first 10 frames. The resulting predictions are shown in the three panels below. Note that the initial 10 frames (where all predictions are identical) are denoted by a shift in the background colour. To make the differences between the predictions clearer, we have drawn in the "tails" of the bobs; these are not present in the raw predictions.
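
In terms of the earlier RNN sketch, this corresponds to running the shared seed once and then branching with different z values. Again this is purely illustrative; it reuses the hypothetical definitions (d_z, d_x, theta_0, H, unpack, rollout) from the sketch above.

    import torch

    def continue_rollout(h, z, n_steps):
        # Open-loop continuation: after the seed frames, the dynamics
        # are governed by the parameters induced by z alone.
        W_h, _, b = unpack(theta_0 + H @ z)
        hs = []
        for _ in range(n_steps):
            h = torch.tanh(W_h @ h + b)
            hs.append(h)
        return torch.stack(hs)

    seed = torch.randn(10, d_x)                   # the shared 10-frame seed
    h_seed = rollout(torch.zeros(d_z), seed)[-1]  # identical prefix (same x_0)
    branches = [continue_rollout(h_seed, torch.randn(d_z), 30)
                for _ in range(3)]                # three different z values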


Example 1: (as for all examples, each panel below begins with the same x_0)

Example 2:

Smooth interpolation between predictions

We demonstrate that the predictions above can be smoothly manipulated via the latent z. The videos below show a grid of predictions, each with a different latent z, where the horizontal axis denotes the value of z_1 and the vertical axis the value of z_2 (as per the paper). As above, every sequence within each video begins with the same x_0 (these are in fact the x_0s from the examples above). The horizontal direction (z_1) corresponds approximately to the energy of the green bob (rightmost is lowest energy), and the vertical direction (z_2) corresponds to decreasing system energy (top is lowest energy).
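
The grid of latents itself is straightforward to construct; a small sketch, with ranges and orientations made up for illustration:

    import numpy as np

    z1_vals = np.linspace(-2.0, 2.0, 5)   # varies green-bob energy (rightmost lowest)
    z2_vals = np.linspace(2.0, -2.0, 5)   # varies system energy (top lowest)
    z_grid = [[np.array([z1, z2]) for z1 in z1_vals] for z2 in z2_vals]
    # Each entry conditions one panel's prediction; all panels share x_0.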

It may be instructive to pause the video and consider the relative positions of the bobs across each of the frames. Due to the smooth interpolation, observing differences can require a little care.


Example 1:

Example 2:

References

Babaeizadeh, M., Finn, C., Erhan, D., Campbell, R. H., & Levine, S. (2018). Stochastic Variational Video Prediction. In Proceedings of the International Conference on Learning Representations (ICLR). arXiv:1710.11252.

Martinez, J., Black, M. J., & Romero, J. (2017). On Human Motion Prediction Using Recurrent Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). arXiv:1705.02445.