Paper: https://arxiv.org/abs/2511.15684
Code: https://github.com/PolymathicAI/walrus
Blog: https://polymathic-ai.org/blog/walrus/
Attribution for the Walrus background goes to Getty Images; the image is licensed via Unsplash+.
This project page is intended to showcase simulation rollouts generated by Walrus, the new multi-domain foundation model for continuum dynamics released by Polymathic. It was trained on 19 different physical scenarios spanning 63 physical variables in both 2D and 3D. Architecturally, Walrus is fundamentally just a large factorized space/time transformer using an encoder-processor-decoder structure, but it takes advantage of a few new tricks to improve rollout stability, flexibility, and training throughput. For the solutions seen below, the most relevant are:
Patch Jittering
Walrus suppresses the growth of long-run instabilities through the use of patch jittering. Patch jittering involves randomly translating the reference frame (with padding for boundaries) before each step. While the paper goes into more theoretical detail on why this works, the core idea is that a fixed downsampling pattern leads to a predictable accumulation of error, and that randomizing where that pattern falls on the field helps alleviate this pathology.
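As a rough illustration of the idea (not the Walrus implementation), the sketch below wraps a single-step predictor so that the field is shifted by a random offset before the step and shifted back afterwards. The function name `jittered_step`, the channels-last layout, the choice of offsets in `[0, patch_size)`, and the use of edge padding for non-periodic boundaries are all assumptions made for this example.

```python
import numpy as np

def jittered_step(step_fn, state, patch_size, rng, periodic=True):
    """Apply step_fn to a randomly translated copy of state, then undo the shift.

    state: array of shape (*spatial, channels); channels-last is an assumption.
    step_fn: hypothetical one-step predictor that preserves the array shape.
    """
    spatial_axes = tuple(range(state.ndim - 1))
    # One random offset per spatial axis, smaller than the patch size.
    shifts = tuple(int(rng.integers(0, patch_size)) for _ in spatial_axes)

    if periodic:
        # Periodic boundaries: a circular shift moves the patch grid
        # relative to the field without losing any data.
        shifted = np.roll(state, shifts, axis=spatial_axes)
        out = step_fn(shifted)
        return np.roll(out, tuple(-s for s in shifts), axis=spatial_axes)

    # Non-periodic boundaries: pad by one full patch in total per axis so the
    # padded size stays divisible by patch_size, then crop the original window.
    pad = [(s, patch_size - s) for s in shifts] + [(0, 0)]
    padded = np.pad(state, pad, mode="edge")  # edge padding is an assumption
    out = step_fn(padded)
    crop = tuple(slice(s, s + n) for s, n in zip(shifts, state.shape[:-1]))
    return out[crop + (slice(None),)]

def rollout(step_fn, state, n_steps, patch_size, seed=0):
    """Autoregressive rollout that draws a fresh jitter at every step."""
    rng = np.random.default_rng(seed)
    frames = [state]
    for _ in range(n_steps):
        frames.append(jittered_step(step_fn, frames[-1], patch_size, rng))
    return np.stack(frames)
```

Because a new offset is drawn at every step, patch boundaries fall on different grid points from one step to the next, so errors tied to the patch grid are not reinforced at the same locations throughout the rollout.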
Adaptive Compute
To handle varying compute budgets and problem complexities, we also employ stride modulation, which lets users adjust the resolution at which the model operates on their downstream problem. In this approach, the downsampling layers of the encoder/decoder dynamically adjust their stride based on a target internal resolution. During pretraining, this was used to keep the internal resolution fairly consistent (32/33 per dimension in 2D, 16/17 in 3D).
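To make the stride arithmetic concrete, here is a minimal sketch of how a stride could be chosen per dimension to land near a target internal resolution. This is an illustration only: the helper names, the candidate stride set, and the ceiling-division rule are assumptions for this example and are not taken from the Walrus code.

```python
import math

def choose_stride(size, target, candidates=(1, 2, 4, 8, 16, 32)):
    """Pick the candidate stride whose output size ceil(size / stride)
    is closest to target (ties go to the smaller stride).

    The candidate set and the ceiling rule are hypothetical.
    """
    return min(candidates, key=lambda s: (abs(math.ceil(size / s) - target), s))

def internal_resolution(shape, target, candidates=(1, 2, 4, 8, 16, 32)):
    """Return per-dimension strides and the resulting internal resolution."""
    strides = tuple(choose_stride(n, target, candidates) for n in shape)
    resolution = tuple(math.ceil(n / s) for n, s in zip(shape, strides))
    return strides, resolution

# A 512 x 256 field in 2D and a 64^3 field in 3D, using the targets
# quoted above (about 32 per dimension in 2D, about 16 in 3D).
strides_2d, res_2d = internal_resolution((512, 256), target=32)
strides_3d, res_3d = internal_resolution((64, 64, 64), target=16)
```

Under these assumptions, inputs of different sizes map to a similar internal resolution, and raising or lowering the target trades accuracy against compute for the same input.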
Combining these two tools lets us produce the stable, highly accurate solutions we visualize below (either full 2D or central slices of 3D).
(Aspect ratio will look distorted until the video is played)