CVPR 2021 Workshop on 3D Vision and Robotics
June 19th, 2021
In recent years, there has been tremendous progress in 3D vision for the analysis and understanding of 3D data, including 3D semantic segmentation and 3D object detection and tracking. These advances have, however, not yet translated into significant progress on several fundamental challenges in robotics. Active perception in static and dynamic environments, inference of spatial relations in 3D scenes, activity recognition, and behavior prediction in real-world settings are a few examples of challenging robotics problems. To successfully tackle these problems, we should leverage the inherent 3D nature of the physical world and apply deep learning approaches that learn 3D representations that are robust to input perturbation and generalize to real-world variations with high sample efficiency (e.g., transformation invariance). This workshop presents a timely opportunity to bring together researchers from the computer vision, machine learning, and robotics communities to discuss the unique challenges and opportunities in 3D vision for robotics.
Topics of Interest
Is 3D useful for robotics? What kind of 3D representations are useful for robotics?
How can a robot learn a 3D representation of its environment and relevant objects from raw sensory input under noisy sensors and actuators? What kind of machine learning algorithms are needed?
What is the right interface between 3D perception and planning & control?
What are the underexplored areas of 3D perception for robotics (e.g. instance recognition, few-shot learning)?
What is the role of 3D simulation for robotics?
Both robotics and 3D data are research areas with high barriers to entry. How can we enable researchers from other fields, such as ML, to work in these areas more easily?
Speakers and Talks
Sanja Fidler is an Associate Professor at the University of Toronto and a Director of AI at NVIDIA, leading a research lab in Toronto. Prior to coming to Toronto, in 2012/2013, she was a Research Assistant Professor at Toyota Technological Institute at Chicago, an academic institute located on the campus of the University of Chicago. She did her postdoc with Prof. Sven Dickinson at the University of Toronto in 2011/2012. She completed her PhD in computer science at the University of Ljubljana in 2010, and was a visiting student at UC Berkeley in the final year of her PhD. She has served as an Area Chair for multiple computer vision, machine learning, and NLP conferences (CVPR, ICLR, EMNLP, ACCV), and as a Program Chair of 3DV’16. Her main research interests are object recognition, 3D scene understanding, and combining vision and language.
David Held is an assistant professor at Carnegie Mellon University in the Robotics Institute and is the director of the RPAD lab: Robots Perceiving And Doing. His research focuses on perceptual robot learning, i.e., developing new methods at the intersection of robot perception and planning for robots to learn to interact with novel, perceptually challenging, and deformable objects. David has applied these ideas to robot manipulation and autonomous driving. Prior to coming to CMU, David was a post-doctoral researcher at U.C. Berkeley, and he completed his Ph.D. in Computer Science at Stanford University. David also has a B.S. and M.S. in Mechanical Engineering from MIT. David is a recipient of the Google Faculty Research Award in 2017 and the NSF CAREER Award in 2021.
Kristen Grauman is a Professor in the Department of Computer Science at the University of Texas at Austin and a Research Scientist at Facebook AI Research. Her research in computer vision and machine learning focuses on visual recognition and search. Before joining UT Austin in 2007, she received her Ph.D. at MIT in computer science. She is a Sloan Fellow, a recipient of NSF CAREER and ONR Young Investigator awards, the 2013 PAMI Young Researcher Award, the 2013 IJCAI Computers and Thought Award, a Presidential Early Career Award for Scientists and Engineers (PECASE), a 2017 Helmholtz Prize computer vision “test of time” award, and the 2018 J.K. Aggarwal Prize from the International Association for Pattern Recognition. She and her collaborators were recognized with best paper awards at CVPR 2008, ICCV 2011, and ACCV 2016.
Manolis Savva is an Assistant Professor in the School of Computing Science at Simon Fraser University, and a Canada Research Chair in Computer Graphics. He completed his PhD at the Stanford graphics lab, advised by Pat Hanrahan. His research focuses on human-centric 3D scene analysis, 3D scene generation, and simulation for scene understanding. He has also worked in data visualization, grounding of natural language to 3D content, and in creating large-scale scene datasets for 3D deep learning.
Panel Discussion Video
Call for Abstracts
We solicit 2-4 page extended abstracts conforming to the official CVPR style guidelines. A paper template is available in LaTeX and Word. References will not count towards the page limit. The review process is double-blind. Submissions can include late-breaking results, material under review, and archived or previously accepted work (please make a note of this in the submission).
Submission page: https://cmt3.research.microsoft.com/3DVR2021
Submission Deadline: April 21, 2021 (11:59 pm PST)
Papers Assigned to Reviewers: April 24, 2021 (11:59 pm PST)
Reviews Due: May 8, 2021 (11:59 pm PST)
Acceptance Decision: May 15, 2021 (11:59 pm PST)
Camera-Ready Version: May 29, 2021 (11:59 pm PST)
Please note that accepted contributions will be presented as spotlight talks at the workshop and will be posted on the workshop website upon author approval.