Zach Chavis
Visual Portfolio
I'm a CSci PhD candidate advised by Dr. Stephen J. Guy and Dr. Hyun Soo Park, applying computer vision, motion planning, and deep learning techniques to computer graphics, robotics, and human motion understanding.
My latest research has focused on egocentric vision, natural language, and human motion.
Currently spending the summer exploring multi-agent vision and quadrupedal field robotics in Minnesota :-)
Contact me at chavi014 at umn.edu
Me, talking about Ego-Exo4D at CVPR 2024!
Used a simple pre-processing technique to improve egocentric keystep recognition without altering the baseline model architecture. Accepted to EgoVis Workshop at CVPR 2025! Project Page.
Learned to predict the spatial locations of novel tasks, specified in natural language, from egocentric video demonstrations. Accepted to ICRA 2025! Project Page.
Improved foundation models' ability to plan over complex scenes for robot navigation and manipulation. Accepted at FMNS Workshop at ICRA 2025! Project Page.
Improved performance on first-person activity classification and localization benchmarks in Ego-Exo4D. In Submission.
Characterized reaching motion strategies across childhood from multi-view videos. Preprint here.
Contributed to Ego-Exo4D, a dataset of 1,300 hours of skilled human activities captured from egocentric and exocentric viewpoints, built alongside Meta! Accepted at CVPR 2024! Paper here.
Interned at Meta FAIR (2022), where I used annotated cleaning demonstrations to improve mobile manipulators on home-cleaning tasks!
Also brought Spot to Habitat ;-)
Created a dataset of roughly 75K first-person image-trajectory pairs for future-trajectory planning for the Ego4D project with Meta! Accepted at CVPR 2022! Paper here.
We leverage Ego-Exo4D demonstrations to augment VLMs in two ways: understanding spatial task affordances, and localizing the task relative to the egocentric viewer. We then demonstrate this system on a simulated robot.
A massive-scale exo+egocentric video-language dataset and benchmark suite for skilled activities.
My contributions included assisting with dataset standardization, on-site data collection, and annotation.
Using a small dataset of human poses, we learn a geometry-aware pose-prediction network that augments the reward function for reinforcement learning in AI Habitat. Our system improves robot efficiency over the state of the art on house-cleaning tasks.
Implemented an optimization-based (sub-gradient descent) controller for steering a virtual non-holonomic vehicle toward an arbitrary goal.
Used a neural network trained via reinforcement learning to output controls that steer a non-holonomic vehicle toward an arbitrary goal.
Optimized the RL agent using the gradient-free Cross-Entropy Method, implemented from scratch in CUDA.
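For readers unfamiliar with the Cross-Entropy Method, here is a minimal NumPy sketch of the idea. The project's actual implementation is in CUDA and optimizes a neural controller; the objective, dimensions, and hyperparameters below are purely illustrative assumptions.

```python
import numpy as np

def cross_entropy_method(objective, dim, iters=50, pop_size=64, elite_frac=0.125, seed=0):
    """Gradient-free CEM: sample parameter vectors from a Gaussian,
    keep the best-scoring elites, and refit the Gaussian to them."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        # Sample a population of candidate parameter vectors.
        samples = rng.normal(mean, std, size=(pop_size, dim))
        scores = np.array([objective(s) for s in samples])
        # Keep the top-scoring candidates and refit the sampling distribution.
        elites = samples[np.argsort(scores)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean

# Illustrative objective: maximize closeness of a 2-D parameter vector to a goal.
goal = np.array([1.0, -2.0])
best = cross_entropy_method(lambda p: -np.sum((p - goal) ** 2), dim=2)
print(best)  # converges toward the goal
```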
Deferred Rendering with SSAO, bit packing, and normal mapping using octahedral encoding.
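As background, octahedral encoding packs a unit normal into two components by projecting onto an octahedron and folding the lower hemisphere over the upper one. Below is a minimal NumPy sketch of that standard mapping; the project's shader code is not shown, and the function names are illustrative.

```python
import numpy as np

def sign_not_zero(v):
    """Per-component +1/-1, treating 0 as +1 (the usual trick for this encoding)."""
    return np.where(v >= 0.0, 1.0, -1.0)

def oct_encode(n):
    """Project a unit normal onto the octahedron |x|+|y|+|z| = 1, then fold
    the lower hemisphere so the result fits in two components in [-1, 1]."""
    p = n[:2] / np.sum(np.abs(n))
    if n[2] < 0.0:
        p = (1.0 - np.abs(p[::-1])) * sign_not_zero(p)
    return p

def oct_decode(e):
    """Invert the mapping and renormalize to recover an approximate unit normal."""
    v = np.array([e[0], e[1], 1.0 - abs(e[0]) - abs(e[1])])
    if v[2] < 0.0:
        v[:2] = (1.0 - np.abs(v[1::-1])) * sign_not_zero(v[:2])
    return v / np.linalg.norm(v)

n = np.array([0.3, -0.5, -0.81])
n = n / np.linalg.norm(n)
print(n, oct_decode(oct_encode(n)))  # decode(encode(n)) stays close to n
```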
Multi-person reconstruction and collaboration estimation from multi-view camera setup.
Basic animated character controller in Unity with a FABRIK implementation to hold a simple weapon.
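FABRIK solves inverse kinematics by alternately dragging the chain from the end effector to the target and re-anchoring it at the root, keeping bone lengths fixed. Here is a minimal NumPy sketch under those assumptions; the Unity/C# version and any joint constraints are not shown.

```python
import numpy as np

def fabrik(joints, target, tol=1e-3, max_iters=32):
    """FABRIK IK: backward pass drags the end effector to the target,
    forward pass re-anchors the root, preserving segment lengths."""
    joints = [np.asarray(j, dtype=float) for j in joints]
    lengths = [np.linalg.norm(joints[i + 1] - joints[i]) for i in range(len(joints) - 1)]
    target = np.asarray(target, dtype=float)
    # If the target is out of reach, stretch the chain straight toward it.
    if np.linalg.norm(target - joints[0]) > sum(lengths):
        for i in range(len(lengths)):
            d = (target - joints[i]) / np.linalg.norm(target - joints[i])
            joints[i + 1] = joints[i] + d * lengths[i]
        return joints
    root = joints[0].copy()
    for _ in range(max_iters):
        # Backward pass: place the end effector at the target, work toward the root.
        joints[-1] = target.copy()
        for i in range(len(joints) - 2, -1, -1):
            d = (joints[i] - joints[i + 1]) / np.linalg.norm(joints[i] - joints[i + 1])
            joints[i] = joints[i + 1] + d * lengths[i]
        # Forward pass: re-anchor the root, work toward the end effector.
        joints[0] = root.copy()
        for i in range(len(joints) - 1):
            d = (joints[i + 1] - joints[i]) / np.linalg.norm(joints[i + 1] - joints[i])
            joints[i + 1] = joints[i] + d * lengths[i]
        if np.linalg.norm(joints[-1] - target) < tol:
            break
    return joints

arm = [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]]
print(fabrik(arm, [1.5, 1.5, 0.5])[-1])  # end effector lands near the target
```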
Stylized character animation in Blender. Design, Modeling, Rigging, and Animation by me.
Frustum Culling and LOD to render billions of triangles at reasonable framerates!
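The core of frustum culling is a cheap box-versus-plane rejection test run before drawing each object. Below is a minimal NumPy sketch assuming axis-aligned bounding boxes and inward-facing plane normals; the plane values are illustrative, and LOD selection is not shown.

```python
import numpy as np

def aabb_outside_plane(center, half_extents, plane):
    """plane = (normal, d), with inside points satisfying dot(normal, x) + d >= 0.
    The box is culled by this plane only if it lies entirely on the negative side."""
    normal, d = plane
    r = np.dot(half_extents, np.abs(normal))  # projected radius of the AABB onto the normal
    return np.dot(normal, center) + d < -r

def frustum_cull(center, half_extents, planes):
    """Conservative visibility test: keep the box unless some plane rejects it outright."""
    return not any(aabb_outside_plane(center, half_extents, p) for p in planes)

# Illustrative frustum: just a near plane at z = -0.1 and a far plane at z = -100 (camera looks down -z).
planes = [(np.array([0.0, 0.0, -1.0]), -0.1),   # near: -z - 0.1 >= 0  ->  z <= -0.1
          (np.array([0.0, 0.0, 1.0]), 100.0)]   # far:   z + 100 >= 0  ->  z >= -100
print(frustum_cull(np.array([0.0, 0.0, -10.0]), np.array([1.0, 1.0, 1.0]), planes))  # True (visible)
```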
Cloth simulation (Need to reimplement with ADMM and CUDA!)
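For reference, the simplest form of such a cloth solver is a mass-spring system with Verlet integration. Here is a minimal NumPy sketch of one time step; the constants are illustrative, collisions and damping are omitted, and the ADMM/CUDA reimplementation mentioned above would replace this explicit integration.

```python
import numpy as np

def cloth_step(pos, prev_pos, springs, rest_len, dt=1.0 / 60.0, k=500.0, mass=0.05):
    """One Verlet step of a mass-spring cloth: accumulate spring and gravity
    forces on each particle, then integrate positions."""
    forces = np.zeros_like(pos)
    forces[:, 1] -= 9.81 * mass                   # gravity on every particle (y is up)
    for (i, j), length in zip(springs, rest_len):
        d = pos[j] - pos[i]
        dist = np.linalg.norm(d)
        f = k * (dist - length) * d / dist        # Hooke's law along the spring
        forces[i] += f
        forces[j] -= f
    new_pos = 2.0 * pos - prev_pos + (forces / mass) * dt * dt
    return new_pos, pos

# Two particles joined by one spring, falling under gravity.
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pos, prev = cloth_step(pos, pos.copy(), springs=[(0, 1)], rest_len=[1.0])
print(pos)
```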