Featured & Latest 🌱🌷🍃

💎 FoFPred (Future Optical Flow Prediction): Predicting future optical flow turns out to be surprisingly powerful; it boosts both robot control 🤖 and video generation 🎥. I'm thrilled to share the release of 𝗙𝗼𝗙𝗣𝗿𝗲𝗱 (paper/code/checkpoint/interactive demo) 👉 𝗧𝗿𝘆 𝗶𝘁 𝘆𝗼𝘂𝗿𝘀𝗲𝗹𝗳! 👏 Huge shoutout to Kanchana Ranasinghe! See the post & video on LinkedIn / X. 😉

💎 Robotic VLA: VLAs can't just mimic expert trajectories — they need 𝗽𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝘃𝗲 𝗺𝗼𝘁𝗶𝗼𝗻 𝗿𝗲𝗮𝘀𝗼𝗻𝗶𝗻𝗴. Our new work shows that jointly learning motion prediction via optical flow image diffusion gives 𝗥𝗼𝗯𝗼𝘁𝗶𝗰 𝗩𝗟𝗔𝘀 a superior ability to reason about what actions to take. The result: stronger, more reliable manipulation, with a 23% improvement in real-world performance. Great job, Yu! 👍
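The joint-training idea above can be sketched as a two-term objective: an action imitation loss plus an auxiliary flow-diffusion denoising loss. This is a minimal illustration, not the paper's actual implementation — the function and variable names (`joint_vla_loss`, `flow_weight`, the noise-prediction formulation) are assumptions for the sketch.

```python
import numpy as np

def joint_vla_loss(pred_actions, expert_actions,
                   pred_flow_noise, true_flow_noise, flow_weight=0.5):
    """Hypothetical joint objective: action imitation + auxiliary
    optical-flow diffusion loss (names illustrative, not the paper's API)."""
    # Standard behavior-cloning term on expert action trajectories
    action_loss = np.mean((pred_actions - expert_actions) ** 2)
    # Diffusion models are commonly trained to predict the injected noise;
    # here the model denoises future optical-flow images
    flow_loss = np.mean((pred_flow_noise - true_flow_noise) ** 2)
    return action_loss + flow_weight * flow_loss

# Toy tensors standing in for model outputs and targets
rng = np.random.default_rng(0)
a_pred, a_expert = rng.normal(size=(8, 7)), rng.normal(size=(8, 7))
n_pred, n_true = rng.normal(size=(8, 2, 64, 64)), rng.normal(size=(8, 2, 64, 64))
loss = joint_vla_loss(a_pred, a_expert, n_pred, n_true)
```

The point of the auxiliary term is that gradients from flow prediction shape the same representation the policy head uses, so the VLA is pushed to model *how the scene will move*, not just which action the expert took.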

💎 Video agent: The old passive video-perception setup just doesn't make sense anymore. Grabbing all visual info once, with fixed granularity and no query awareness, is inefficient and overloads the model. So we built Active Video Perception (AVP) — an agentic, evidence-seeking framework that treats a video as an interactive environment to be actively explored in a goal-directed manner. Check out my LinkedIn post. Excellent work, Ziyang!
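The contrast with passive perception can be made concrete with a toy loop: instead of ingesting every frame up front, the agent repeatedly picks the frame most relevant to the query, reads it, and accumulates evidence under a budget. This is a hypothetical sketch of the general idea only — `score_frame`, `read_frame`, and the greedy selection rule are illustrative assumptions, not the actual AVP framework.

```python
def active_video_perception(query, n_frames, score_frame, read_frame, budget=5):
    """Toy evidence-seeking loop: greedily inspect the unseen frame that a
    query-aware scorer ranks highest, until the inspection budget is spent."""
    evidence, seen = [], set()
    for _ in range(budget):
        candidates = [i for i in range(n_frames) if i not in seen]
        if not candidates:
            break
        # Query-aware selection: score only frames not yet inspected
        best = max(candidates, key=lambda i: score_frame(query, i))
        seen.add(best)
        evidence.append(read_frame(best))
    return evidence

# Toy usage: a scorer that thinks the answer lives near frame 12
evidence = active_video_perception(
    "find the red ball", n_frames=30,
    score_frame=lambda q, i: -abs(i - 12),  # illustrative relevance scorer
    read_frame=lambda i: f"frame-{i}",
)
```

Only 5 of the 30 frames are ever read, and which 5 depends entirely on the query — that query-conditioned, variable-granularity access is what the passive grab-everything setup cannot do.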