Invited Speakers

ABSTRACT

Scientific fields that are interested in faces have developed their own sets of concepts and procedures for understanding how a target model system (be it a person or an algorithm) perceives a face under varying conditions. In computer vision, this has largely taken the form of dataset evaluation for recognition tasks, where summary statistics are used to measure progress. While aggregate performance has continued to improve, understanding individual causes of failure has been difficult, as it is not always clear why a particular face fails to be recognized, or why an impostor is recognized by an algorithm. Importantly, other fields studying vision have addressed this via visual psychophysics: the controlled manipulation of stimuli and careful study of the responses they evoke in a model system. In this talk, we suggest that visual psychophysics is a viable methodology for making face recognition algorithms more explainable. We develop a comprehensive set of procedures for assessing face recognition algorithm behavior and deploy it over state-of-the-art convolutional neural networks and more basic, yet still widely used, shallow and handcrafted feature-based approaches.
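
The psychophysics idea described in this abstract (manipulate a stimulus in controlled steps, then record how the model's response changes) can be illustrated with a minimal sketch. This is not the speakers' procedure. The feature vector standing in for a face, the additive Gaussian noise used as the manipulation, and the names `toy_matcher`, `perturb`, and `response_curve` are all assumptions made for illustration only.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def perturb(stimulus, level, rng):
    # Hypothetical manipulation: additive Gaussian noise with std = level.
    return [x + rng.gauss(0.0, level) for x in stimulus]


def toy_matcher(probe, reference, threshold=0.8):
    # Hypothetical stand-in for a face matcher: accept if similarity is high.
    return cosine(probe, reference) >= threshold


def response_curve(matcher, reference, levels, trials=200, seed=0):
    """Match rate of `matcher` at each perturbation level."""
    rng = random.Random(seed)
    curve = []
    for level in levels:
        hits = sum(
            matcher(perturb(reference, level, rng), reference)
            for _ in range(trials)
        )
        curve.append((level, hits / trials))
    return curve


# A random 64-dimensional vector plays the role of a face representation.
rng = random.Random(1)
reference = [rng.gauss(0.0, 1.0) for _ in range(64)]
levels = [0.0, 0.25, 0.5, 1.0, 2.0, 4.0]

curve = response_curve(toy_matcher, reference, levels)
for level, rate in curve:
    print(f"noise={level:<5} match rate={rate:.2f}")
```

Plotting match rate against perturbation level gives a response curve for a single stimulus, which shows where recognition breaks down for that input rather than only an aggregate score over a dataset.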

Women also snowboard: Diagnosing and correcting bias in captioning models


Dynamic Meta-knowledge Creation from Data for Understanding, Data Generation and Decision Making

ABSTRACT

A central goal of artificial intelligence is to understand and reason about complex, real-world environments, humans, and their activities, and to enable reliable and efficient decision making. To address this problem, we have been developing a general, scalable computational framework that combines principles of machine learning, sparse methods, mixed norms, AI, dictionaries, and deformable modeling methods. In this talk, we will present new machine learning methods for modeling and understanding the relationship between faces, bodies, and scenes from simple camera inputs. We present methods for object detection, segmentation, and classification, as well as the use of GANs and self-supervised learning, that can be used for many applications such as monocular multi-view image generation, attribute editing and retargeting, and image-to-video translation for data augmentation and storytelling.