Invited Talks

Invited speakers will give technical talks about their research in computer vision.

Chelsea Finn

Assistant Professor, Computer Science and Electrical Engineering, Stanford University

About Chelsea: Chelsea is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University. Finn's research interests lie in the capability of robots and other agents to develop broadly intelligent behavior through learning and interaction. To this end, her work has included deep learning algorithms for concurrently learning visual perception and control in robotic manipulation skills, inverse reinforcement learning methods for scalable acquisition of nonlinear reward functions, and meta-learning algorithms that can enable fast, few-shot adaptation in both visual perception and deep reinforcement learning. Finn received her Bachelor's degree in Electrical Engineering and Computer Science at MIT and her PhD in Computer Science at UC Berkeley. Her research has been recognized through the ACM Doctoral Dissertation Award, an NSF graduate fellowship, a Facebook fellowship, the C.V. Ramamoorthy Distinguished Research Award, and the MIT Technology Review 35 under 35 Award, and her work has been covered by various media outlets, including the New York Times, Wired, and Bloomberg. Throughout her career, she has sought to increase the representation of underrepresented minorities within CS and AI by developing an AI outreach camp at Berkeley for underprivileged high school students and a mentoring program for underrepresented undergraduates across four universities, and by leading efforts within the WiML and Berkeley WiCSE communities of women researchers.

Title: Generalization in Visuomotor Learning

The recorded talk by Chelsea Finn for WiCV 2020 can be seen here. If you get a 'couldn't preview file' error, please click the pop-out icon at the top right of the video to play it in a new tab.

Chelsea-wicv_june2020_final.mp4

Kavita Bala

Chair, Department of Computer Science, Cornell University

Research Advisor, Facebook

About Kavita: Kavita Bala is the Chair of the Computer Science Department at Cornell University. She received her S.M. and Ph.D. from MIT and her B.Tech. from IIT (Bombay), and co-founded GrokStyle (recently acquired by Facebook). Bala specializes in computer vision and computer graphics, leading research in recognition and visual search; physically-based scalable rendering; material modeling and acquisition using physics and learning; and perception. Bala has authored 100+ peer-reviewed publications and the graduate-level textbook “Advanced Global Illumination”, and has served as the Editor-in-Chief of Transactions on Graphics (TOG) and as chair of SIGGRAPH Asia Papers in 2011. She is an ACM Fellow and recipient of the 2020 ACM SIGGRAPH Computer Graphics Achievement Award.

Title: Visual Understanding at Global Scale

Abstract: Augmented reality/mixed reality (AR/MR) is poised to create compelling and immersive user experiences by combining computer vision and computer graphics. Imagine users interacting with the world around them through their AR device. Visual search tells them what they are seeing, while computer graphics augments reality by overlaying real objects with virtual objects. AR/VR can have a far-ranging impact across many applications, such as retail, virtual prototyping, and entertainment. In this talk, I will describe my group’s research on these complementary areas: graphics models for realistic visual appearance, and visual search and fine-grained recognition for scene understanding. If time permits, I will talk about how these technologies can go beyond AR/VR applications to enable visual discovery—using recognition as a core building block; for example, by mining image collections at a global scale to discover visual patterns and trends across geography and time.

The recorded talk by Kavita Bala for WiCV 2020 can be seen here. If you get a 'couldn't preview file' error, please click the pop-out icon at the top right of the video to play it in a new tab.

KavitaBalaWiCVCVPR2020.mp4

Georgia Gkioxari

Research Scientist at Facebook AI Research (FAIR)

About Georgia: Georgia Gkioxari is a research scientist at Facebook AI Research (FAIR). She received a PhD in computer science and electrical engineering from the University of California at Berkeley under the supervision of Jitendra Malik in 2016. Her research interests lie in computer vision, with a focus on object and person recognition from static images and videos. In 2017, Georgia received the Marr Prize at ICCV for "Mask R-CNN".

Title: Beyond 2D Visual Recognition

Abstract: Undoubtedly, 2D visual recognition has seen unprecedented success, with the state of the art advancing every single conference cycle. But as we develop sophisticated machines that predict 2D object silhouettes, we tend to ignore the fact that the world is not 2D and objects don't live in a 2D grid. This observation also translates to the realization of state-of-the-art recognition models, which are equipped by design to reason in 2D and operationally omit the third dimension. In this talk, I will present some of our efforts to marry the advances in 2D recognition with 3D reasoning, either by enhancing the predictive capability of recognition models with 3D outputs or by using 3D as an intermediate representation. Lastly, I will introduce our new library PyTorch3D, which builds on PyTorch and is the foundation for our research at the intersection of 3D and deep learning.

The recorded talk by Georgia Gkioxari for WiCV 2020 can be seen here. If you get a 'couldn't preview file' error, please click the pop-out icon at the top right of the video to play it in a new tab.

gkioxari_cvpr2020.mp4

Tali Dekel

Research Scientist at Google, Cambridge

About Tali: Tali Dekel is currently a Senior Research Scientist at Google, Cambridge MA, developing algorithms at the intersection of computer vision, computer graphics, and machine learning. She will join the Mathematics and Computer Science Department at the Weizmann Institute, Israel, as a faculty member (Assistant Professor) in 2021. Before Google, she was a Postdoctoral Associate at the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT. Tali completed her PhD studies at the School of Electrical Engineering, Tel-Aviv University, Israel. Her research interests include computational photography, image/video synthesis, geometry, and 3D reconstruction. Her awards and honors include the National Postdoctoral Award for Advancing Women in Science (2014), the Rothschild Postdoctoral Fellowship (2015), the SAMSON - Prime Minister's Researcher Recruitment Prize (2019), Best Paper Honorable Mention in CVPR 2019, and Best Paper Award (Marr Prize) in ICCV 2019. She served as workshop co-chair for CVPR 2020.

Title: Learning to Retime People in Videos

Abstract: By changing the speed of frames, or the speed of objects, we can enhance the way we perceive events or actions in videos. In this talk, I will present two of my recent works on retiming videos, and more specifically, on manipulating the timing of people’s actions. 1) “SpeedNet” (CVPR 2020 oral): a method for adaptively speeding up videos based on their content, allowing us to gracefully watch videos faster while avoiding jerky and unnatural motions. 2) “Layered Neural Rendering for Retiming People” (under review): a method for speeding up, slowing down, or entirely freezing certain people in videos, while automatically and correctly re-rendering all the scene elements that are related to those people, like shadows, reflections, and loose clothing. Both methods are based on novel deep neural networks that learn concepts of natural motion and scene decomposition just by observing ordinary videos, without requiring any manual labels. I’ll show adaptively sped-up videos of sports and of boring family events (that all of us want to watch faster), and I’ll demonstrate various retiming effects on people dancing, groups running, and kids jumping on trampolines.

The recorded talk by Tali Dekel for WiCV 2020 can be seen here. If you get a 'couldn't preview file' error, please click the pop-out icon at the top right of the video to play it in a new tab.