First International Workshop on Affective Understanding in Video

at CVPR 2021

Workshop date: June 19th

Our workshop will have 3 blocks:
Poster session: 7am - 9am PST
Invited speakers: 10am - 1pm PST
Challenge talks: 4pm - 6pm PST

Please contact us at auvi.workshop@gmail.com if you have any questions about the schedule.


Block 1: Poster Session
7am - 9am PST | 10am - 12pm EST | 2pm - 4pm UTC
Accepted Papers
Attend via the Gatherly link on Cadmium, provided by CVPR.

Block 2: Invited Speakers
Talks: 10am - 12pm PST | 1pm - 3pm EST | 5pm - 7pm UTC

Live panel discussion: 12pm - 1 pm PST | 3pm - 4 pm EST | 7pm - 8 pm UTC
Attend via the Zoom link on Cadmium, provided by CVPR.
YouTube stream: https://www.youtube.com/watch?v=b7q0WHtyRLE

Block 3: EEV Challenge Talks
Challenge introduction & talks: 4pm - 5pm PST | 7pm - 8pm EST | 11pm - 12am UTC

Live panel discussion: 5pm - 6pm PST | 8pm - 9pm EST | 12am - 1am UTC
EEV Challenge
Attend via the Zoom link on Cadmium, provided by CVPR.
YouTube stream: https://www.youtube.com/watch?v=b7q0WHtyRLE

Speakers (Live panel from 12pm to 1pm PST):

Aleix Martinez (Ohio State University)
Talk Title:
Toward an AI Theory of Mind: Understanding people’s intent

Advances in AI require algorithms that can understand people’s emotions and intent. That is, we need to develop an AI theory of mind. In this talk, I will present a number of projects my research group has worked on to address this general goal. Specifically, I will first present the work we have completed on the interpretation of emotion from faces, bodies, and context. I will follow with a discussion on how the biomechanics of agents determine the intent of their actions. Finally, I will introduce the first algorithms that can answer hypothetical questions. Throughout, I will derive supervised and unsupervised methods as well as a new approach that allows developers to know whether their deep neural networks are learning to generalize or simply learning to memorize.

Daniel McDuff (Microsoft)
Talk Title:
Seeing Inside Out: Computer Vision for Measuring Internal and External Affective Signals

Bio: Daniel McDuff is a Principal Researcher at Microsoft where he leads research and development of affective technology. Daniel completed his PhD at the MIT Media Lab in 2014 and has a B.A. and Masters from Cambridge University. Daniel’s work on non-contact physiological measurement helped to popularize a new field of low-cost health monitoring using webcams. Previously, Daniel worked at the UK MoD, was Director of Research at MIT Media Lab spin-out Affectiva and a post-doctoral research affiliate at MIT. His work has received nominations and awards from Popular Science magazine as one of the top inventions in 2011, South-by-South-West Interactive (SXSWi), The Webby Awards, ESOMAR and the Center for Integrated Medicine and Innovative Technology (CIMIT). His projects have been reported in many publications including The Times, the New York Times, The Wall Street Journal, BBC News, New Scientist, Scientific American and Forbes magazine. Daniel was named a 2015 WIRED Innovation Fellow, an ACM Future of Computing Academy member and has spoken at TEDx and SXSW. Daniel has published over 100 peer-reviewed papers on machine learning (NeurIPS, ICLR, ICCV, ECCV, ACM TOG), human-computer interaction (CHI, CSCW, IUI) and biomedical engineering (TBME, EMBC).

Chujun Lin (Dartmouth College)
Talk Title:
How people infer others’ mental states and traits: Evidence for bidirectional causation

Humans understand other individuals by considering both their momentary mental states and enduring traits. These two human cognitive processes have independently inspired recent AI research in domains such as multi-agent systems and human-robot interaction. Here, we show that inferences of mental states and traits are not independent: humans link these two forms of social information to gain deeper insight into other people. Specifically, I will first present two correlational studies showing how human participants make associated mental state frequency inferences and trait inferences in naturalistic contexts. I will then present two causation studies showing how human participants make different mental state inferences about people who appear to have different traits, and different trait inferences about people who appear to experience different mental states across a range of situations.

Alan Cowen (Hume AI)
Talk Title:
What facial expressions reveal about emotion: New insights from real-world data

Bio: Alan Cowen is an emotion scientist leading a scientific consortium and technology company called Hume AI and a former researcher at U.C. Berkeley and Google. His research uses computational methods to address how emotional behaviors can be evoked, conceptualized, parameterized, predicted, annotated, and translated, how they influence the course of social interaction and relationship-building, and how they bring meaning to our everyday, aesthetic, and moral lives. He has studied behavioral responses to tens of thousands of emotional stimuli from thousands of participants across multiple cultures, analyzed brain representations of emotion using fMRI and intracranial electrophysiology, investigated ancient sculptures, and used deep learning to measure facial expressions in millions of naturalistic videos from around the world.

EEV Challenge Top Teams (Live panel from 5pm to 6pm PST):

Talk Title (top team): Temporal Convolution Networks with Positional Encoding for Evoked Expression Estimation

Van Thong Huynh, Soo-Hyung Kim, Guee-Sang Lee, Hyung-Jeong Yang
(Chonnam National University)

Talk Title (runner-up): Less is More: Sparse Sampling for Dense Reaction Predictions

Kezhou Lin (Zhejiang University), Xiaohan Wang (Zhejiang University), Zhedong Zheng (University of Technology Sydney), Linchao Zhu (University of Technology Sydney),
Yi Yang (Zhejiang University)

Talk Title: Multi-Granularity Network for Multimodal Affective Understanding in Video

Lin Wang, Baoming Yan, Xiao Liu, Chao Ban, Bo Gao
(Alibaba Group)

Talk Title: 3D CNN to predict Evoked Expressions from Video

Chen Zhang, Quan Sun, Chen Chen
(OPPO Research Institute)