The 1st International Workshop on Human-Centered Vision and Media Technologies

*This image was created with the assistance of DALL·E 3.

Date

Friday, April 12, 2024, 12:30 - 18:30

Venue

Institute of Industrial Science, the University of Tokyo
Seminar Rooms An401/402 (Talks), An403/404 (Posters)

Registration

Participation in the workshop is free, but please register using the form below.
We may close registration if the number of participants exceeds the room capacity.

https://forms.gle/f6Y6TpLUfd56wg2H7 

Program

12:30 - 13:30 Invited Talk: Yuki M. Asano

13:30 - 15:00 Poster Session 1

15:00 - 16:00 Invited Talk: Pascal Mettes

16:00 - 17:30 Poster Session 2

17:30 - 18:30 Invited Talk: Xucong Zhang

Invited Speakers

Yuki M. Asano (University of Amsterdam)

Title: Self-Supervised Learning in the age of CLIP et al.

Abstract: I will talk about new developments in self-supervised learning that will form the core of the next generation of foundation models. First, I will talk about how pretraining on videos can enable models that outperform models such as DINO by leveraging temporality. Second, I will show how ideas from self-supervised learning can be leveraged to drastically reduce the amount of paired image-text data needed for essential vision-language models.

Bio: Yuki M. Asano is an assistant professor for computer vision and machine learning at the QUVA lab at the University of Amsterdam. Prior to this, he completed his PhD at the Visual Geometry Group (VGG) at the University of Oxford where he worked with Andrea Vedaldi and Christian Rupprecht. He has served as an AC for NeurIPS/CVPR/ECCV/ICCV and is the main organiser of the SSLWIN workshops at ECCV, BigMAC at ICCV and co-organises the SSL workshop at NeurIPS.

Pascal Mettes (University of Amsterdam)

Title: Hyperbolic Deep Learning

Abstract: From linear layers and convolutions to self-attention, deep learning is implicitly Euclidean. But should it be? In this talk, I will dive into hyperbolic geometry for deep learning. I will discuss what hyperbolic geometry is and what is different compared to Euclidean geometry. I will then outline the strong potential of hyperbolic deep learning, from learning hierarchical representations to uncertainty and robustness to out-of-distribution and adversarial samples. Lastly, I will show our ongoing efforts towards fully hyperbolic networks and how to get started in this field with our new hyperbolic learning software library.

Bio: Pascal Mettes is an assistant professor at the University of Amsterdam on the topic of knowledge-aware visual understanding. His research focuses on hyperbolic deep learning for computer vision. He received his PhD (2017) and was a postdoc (2018-2019) in computer vision at the University of Amsterdam and was previously affiliated with Columbia University (2016) and the University of Tübingen (2021). He organised the ICCV’21 workshop on Structured Representations for Video Understanding, the Netherlands Conference on Computer Vision 2022, and the ECCV’22 + CVPR’23 tutorials on Hyperbolic Representation Learning in Computer Vision.

Xucong Zhang (Delft University of Technology)

Title: Visually Humans Measurement 

Abstract: Natural interaction between humans and AI agents is both critical and challenging for human-centered intelligent systems, such as personalized robots and virtual characters in AR/VR-based telepresence systems. It significantly influences user acceptance of AI agents, which in turn determines the development of these AI technologies. For example, an intelligent social robot should recognize a person's intention with the aid of audio and non-verbal cues, and react naturally, like another human being, to keep the user engaged. I aim to develop an approach to natural human-AI interaction that accurately perceives human behavior and generates human-like responses to drive AI agents. The technical core of this project is the development of a holistic model that handles multiple behavior features, including facial expression, eye gaze, body posture, hand gestures, and speech, to faithfully reflect subtle movements and audiovisual signals.

Bio: Xucong Zhang is an assistant professor at TU Delft and an ELLIS society member. He was a postdoctoral researcher from 2018 to 2021 in the Advanced Interaction Technologies Lab at ETH Zurich, led by Prof. Otmar Hilliges. He completed his PhD research (summa cum laude) from 2013 to 2018 at the Max Planck Institute for Informatics under the supervision of Prof. Andreas Bulling. Before that, he obtained his Master’s degree (2013) at Beihang University and his Bachelor’s degree (2010) from the Honors Program at China Agricultural University, China. His core research interest is human-centered computing: developing techniques to sense and serve human users.

Poster Presentations

Poster Session 1

Poster ID: Presenter (Affiliation), Title

Poster Session 2

Poster ID: Presenter (Affiliation), Title


Organizers