Weekly Gathering for Vision and Graphics Researchers and Enthusiasts

12:00-1:00 pm (ET), Star Room (32-D463) or Zoom

Check out the MIT Visual Computing Seminar mailing list (Zoom link and sign-up spreadsheet) and YouTube channel for further information.


Most Recent Talks:

Title: Geometric Fields: Animation Beyond Meshes

Speaker: Ana Dodik

Time: Apr 16, 2024, 12:00-1:00 pm ET


Abstract: Despite a flurry of published papers on 3D data processing, modern tools such as Blender and Maya rely primarily on methods from over a decade ago. These tools rely on the finite-element method (FEM) to optimize various smoothness objectives. FEM-based algorithms, while providing strong guarantees necessary for disciplines like civil engineering, beget opaque and brittle pipelines in computer graphics. They put the burden on the end user to ensure that 3D meshes are "well-behaved," requiring the user, for example, to remove self-intersections or to guarantee obscure mathematical properties such as manifoldness, water-tightness, or suitable interior angle bounds. My research uses techniques inspired by modern machine learning tools, not for data-driven learning, but as a computational domain for geometry processing, one that is agnostic to the shape representation and its quality yet nonetheless aware of its geometry. These new mesh-free representations, called geometric fields, allow our algorithms to focus on robustness, user control, and interactivity. This talk focuses on applications of geometric fields to problems in shape deformation and animation.
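To give a flavor of the idea, a "field" stores a quantity (here, a deformation) as a function over all of 3D space rather than on mesh elements, so it can be queried at the vertices of any mesh, a point cloud, or arbitrary samples. The following is a minimal illustrative sketch in this spirit, not the talk's actual method; the tiny MLP and its random weights are assumptions purely for demonstration.

```python
import numpy as np

def make_deformation_field(hidden=64, seed=0):
    """A tiny MLP mapping 3D points to 3D displacement vectors.

    Illustrative only: a mesh-free deformation lives in the network's
    weights, so it is independent of any particular mesh's connectivity
    or quality. Weights are random here for demonstration.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (3, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, 3))
    b2 = np.zeros(3)

    def field(points):                 # points: (N, 3) array of query locations
        h = np.tanh(points @ W1 + b1)  # smooth activation -> smooth field
        return h @ W2 + b2             # (N, 3) displacement per query point
    return field

# The same field can be queried at mesh vertices, point-cloud samples, etc.
deform = make_deformation_field()
displacements = deform(np.random.rand(100, 3))
```

Because the field is defined everywhere in space, the algorithm never needs the input shape to be manifold or water-tight, which is the robustness property the abstract emphasizes.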


Speaker bio: Ana Dodik is a PhD student at MIT CSAIL working on neural representations for geometry processing. Prior to joining MIT, she spent two years developing next-generation virtual presence at Meta. She graduated with a Master’s degree from ETH Zurich, where she spent a year collaborating with Disney Research Studios on problems at the intersection of machine learning and offline rendering.

Coming Next:

Title: Sora: Video Generation Models as World Simulators

Speaker: Tim Brooks, OpenAI

Time: Apr 23, 2024, 12:00-1:00 pm ET


[MIT CSAIL Event]


Abstract: We explore large-scale training of generative models on video data. Specifically, we train text-conditional diffusion models jointly on videos and images of variable durations, resolutions and aspect ratios. We leverage a transformer architecture that operates on spacetime patches of video and image latent codes. Our largest model, Sora, is capable of generating a minute of high fidelity video. Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world.
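The "spacetime patches" the abstract mentions generalize image patch tokenization to video: a clip is cut into small blocks spanning a few frames and a small spatial window, and each block becomes one token for the transformer. A minimal sketch of that tokenization step is below; the patch sizes are assumed values for illustration, not Sora's actual configuration (Sora also patchifies latent codes rather than raw pixels).

```python
import numpy as np

def spacetime_patches(video, pt=2, ph=16, pw=16):
    """Split a video tensor (T, H, W, C) into flattened spacetime patches.

    Each patch spans pt frames and a ph x pw spatial window, yielding one
    token per patch. Patch sizes here are illustrative assumptions.
    """
    T, H, W, C = video.shape
    assert T % pt == 0 and H % ph == 0 and W % pw == 0
    v = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
    v = v.transpose(0, 2, 4, 1, 3, 5, 6)      # group the three patch axes
    return v.reshape(-1, pt * ph * pw * C)    # (num_tokens, token_dim)

# A 16-frame 64x64 RGB clip -> (16/2) * (64/16) * (64/16) = 128 tokens.
tokens = spacetime_patches(np.zeros((16, 64, 64, 3)))
```

Because the token count simply follows from the clip's shape, the same tokenizer handles variable durations, resolutions, and aspect ratios, which is what lets one transformer train jointly on heterogeneous videos and images.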


Speaker bio: Tim Brooks is a research scientist at OpenAI, where he co-leads Sora, their video generation model. His research investigates large-scale generative models that simulate the physical world. Tim received his PhD from UC Berkeley, where he was advised by Alyosha Efros at Berkeley AI Research and invented InstructPix2Pix. He previously worked on the AI that powers the Pixel phone's camera at Google and on video generation models at NVIDIA.

MIT Visual Computing Seminar Weekly Schedule:

Seminar Organizers: