Mutual benefits of cognitive and computer vision:
How can we use one to understand the other?
State-of-the-art computer vision systems have benefited greatly from our understanding of the human visual system, and research in human vision has benefited from image processing and modeling techniques from computer vision. Current advances in machine learning (e.g., deep learning) have produced computer vision systems that rival human performance on various narrowly defined tasks such as object classification. However, the biological visual system remains the gold standard for efficient, flexible, and accurate performance across a wide range of complex real-world tasks. We believe that close collaboration between the fields of human and computer vision will lead to further breakthroughs in both. The goal of this workshop is to investigate the relationships between biological and computer vision, and how we can use insights from one to better understand the other.
Our workshop will broadly address relationships between biological vision and computer vision, but questions of particular interest include: 1) How does the concept of "attention" in computer vision relate to processing in the human visual attention system? 2) How do the features humans use to represent objects and scenes compare to the features learned by artificial deep networks trained for large-scale image classification? And 3) Should computer vision systems be modeled after the primate visual system? The goal of the workshop is to foster a level of understanding that is more than the sum of its questions, one where unifying principles emerge that shape new questions and new directions for seeking answers.
For the first time in human history, it is possible to build computational systems that begin to parallel, at least remotely, the complexity of the primate visual system. This technological advance raises the question of whether computer vision, or a substantial subfield thereof, should exploit the vast knowledge that now exists of brain organization and functional architecture to create truly brain-inspired models. Taking attention as one recent example, computer vision has appealed to this concept largely as a metaphor; there has been no serious attempt to model attention as it exists in the brain. And there are good reasons not to do so. The primate visual system is rife with limitations that, from a computer vision perspective, might be viewed as unnecessary constraints. But is there potential value in enduring these short-term limitations in order to reach a long-term benefit: the creation of a cognitively flexible visual system? This is a debate that we need to have.
- Similarities and differences in architectures for processing visual information in the human brain and computer vision systems
- Neural network architectures with effective, efficient, and "human-like" attentional mechanisms
- Identifying core limitations of existing computer vision/deep learning systems compared to human vision
- Brain-engineered deep neural networks
- Learning rules used in computer vision and by the brain (e.g. reinforcement and inverse reinforcement learning, unsupervised/semi-supervised learning, Hebbian learning, spike-timing-dependent plasticity)
- Comparisons of feature representations in human and computer vision
- What are meaningful tasks and metrics to compare human and computer vision performance (e.g. eye fixations during visual search)?
- New datasets for learning and predicting human behavior and neural responses, from human fMRI and primate cell recordings
- Machine representations that generalize as humans do (or nearly so)
- Modeling attention control and predicting human attention