12:00 PM - 12:10 PM Welcome
12:10 PM - 1:00 PM Invited Speaker 1: Dr. Federico Tombari
- 3D Object Understanding with Less Annotation
1:00 PM - 1:50 PM P1 - Paper Session 1
- 1:00 - 1:10 Towards Generalization of Human Pose Estimation in the Wild. [Presentation]
- 1:10 - 1:30 A Novel Joint Points and Silhouette-Based Method to Estimate 3D Human Pose and Shape. [Presentation]
- 1:30 - 1:50 3D Human Pose Estimation Based on Multi-Input Multi-Output CNN and Event Cameras: Proof of Concept on the DHP19 Dataset. [Presentation]
1:50 PM - 2:00 PM Break
2:00 PM - 3:00 PM P2 - Paper Session 2
- 2:00 - 2:20 Image-based Out-of-Distribution-Detector Principles on Graph-based Input Data in Human Action Recognition. [Presentation]
- 2:20 - 2:40 Pose Based Trajectory Forecast of Vulnerable Road Users Using Recurrent Neural Networks. [Presentation]
- 2:40 - 2:50 Space-Time Triplet Loss Network for Dynamic 3D Face Verification. [Presentation]
- 2:50 - 3:00 Subject Identification Across Large Expression Variations Using 3D Facial Landmarks. [Presentation]
3:00 PM - 3:50 PM Invited Speaker 2: Prof. Anup Basu
- Perceptually Guided Transmission of 3D Humans
3:50 PM - 4:00 PM Closing
Federico Tombari
Abstract: 3D object understanding encompasses techniques for reasoning about 3D shapes from images or 3D data. Common tasks are object classification, pose estimation, and shape reconstruction/completion, which are now commonly tackled via deep learning. One limiting aspect is the extensive annotation required for training, which is particularly hard to obtain, especially in 3D. In this talk I will present some recent directions in 3D object understanding, with a particular focus on requiring fewer annotations. The talk is divided into two main parts. In the first part, I will focus on learning feature embeddings from point cloud data. I will present a recent approach, named SoftPoolNet, for learning informative embeddings for 3D object classification and completion. In a different work, focused on point cloud classification and pose estimation based on 3D capsules, I will show how embedding particular properties such as rotation invariance can soften the data requirements at training time. The second part will explore recent trends in 6D object pose estimation in clutter. I will illustrate how self-supervised learning can be used to relax the need for annotated data, improving 6D pose estimation by leveraging unlabeled data at inference time.
Anup Basu
Abstract: Perceptual and biological motivation can help develop better and more robust image processing and computer vision algorithms. More specifically, it can help in robust multi-camera motion estimation, active camera calibration, foveated image/video/3D compression, and in understanding the role of spatially varying sensing in 3D perception and depth reconstruction. Perceptual factors can also guide us toward better compression and transmission of 3D Humans. For example, Motion Capture (MoCap) data for humans can be optimally compressed by considering bone lengths relative to motion and the joints in contact with a surface; packet loss during mesh transmission can be better protected against with stripification and interleaving; a 4D hierarchical time series and Just Noticeable Distance (JND) can support fast and interactive visualization of dynamic point clouds of human models; and even loss of MoCap packets can be addressed using an interleaved LDPC algorithm.