high-risk research on hard problems in natural and artificial systems:
the constraints necessary to "build" a visual system
better-articulated models of intermediate and high-level vision
the organizational principles of visual cortex
how vision interacts with other modalities
tools drawn from cognitive science, machine learning, computer vision, computer graphics, and large-scale neuroimaging, including fMRI, DTI, MEG, EEG
My own (crazy) thoughts about some of our recent work: The organization of knowledge within visual cortex may be best characterized as a multimodal "operating system" for learning about, representing, and processing information relevant to adaptive behaviors. Put another way, the organization of visual cortex necessarily reflects and emerges from common adaptive behaviors of humans (e.g., food selectivity emerges because humans eat frequently, and so on). However, the consistency of organization for these functional structures across individuals suggests otherwise, pointing instead towards a core set of constraints that enable a specific higher-order knowledge structure that is inherently anchored in real-world behaviors. Exploring these ideas further will require new ways of thinking about visual cortex.
news
CONGRATULATIONS to Jayanth for successfully defending his PhD thesis "Rethinking object categorization in computer vision" (September 26, 2023).
NEW RESEARCH: We introduce a new task, Labeling Instruction Generation, to address the lack of publicly available labeling instructions for large-scale visual datasets. In Labeling Instruction Generation, we take a reasonably annotated dataset and: 1) generate a set of examples that are visually representative of each category in the dataset; 2) provide a text label that corresponds to each of the examples. We introduce a framework that requires no model training to solve this task and includes a newly created rapid retrieval system that leverages a large, pre-trained vision and language model. This framework acts as a proxy for human annotators and can help to both generate a final labeling instruction set and evaluate its quality. Our framework generates multiple diverse visual and text representations of dataset categories.
arXiv preprint: https://arxiv.org/abs/2306.14035
CONGRATULATIONS to Aria for successfully defending her PhD thesis "Using Task Driven Methods to Uncover Representations of Human Vision and Semantics" (June 23, 2023). Aria will be moving to NIH this summer.
NEW RESEARCH: We introduce a data-driven approach in which we synthesize images predicted to activate a given brain region using paired natural images and fMRI recordings, bypassing the need for category-specific stimuli. Our approach -- Brain Diffusion for Visual Exploration ("BrainDiVE") -- builds on recent generative methods by combining large-scale diffusion models with brain-guided image synthesis.
Accepted for NeurIPS 2023!
arXiv preprint: https://doi.org/10.48550/arXiv.2306.03089
NEW RESEARCH: We've trained a neural network to predict brain responses to images, and then "dissected" the network to examine the selectivity of spatial properties across high-level visual areas. Discover more about our work: brain-dissection.github.io
Accepted for NeurIPS 2023!
Gabe's twitter thread: https://twitter.com/GabrielSarch/status/1663950775284801536?s=20
BioRxiv preprint: https://doi.org/10.1101/2023.05.29.542635
NEW RESEARCH: Children typically learn the meanings of nouns earlier than the meanings of verbs. However, it is unclear whether this asymmetry is a result of complexity in the visual structure of categories in the world to which language refers, the structure of language itself, or the interplay between the two sources of information. We quantitatively test these three hypotheses regarding early verb learning by employing visual and linguistic representations of words sourced from large-scale pre-trained artificial neural networks.
arXiv preprint: https://arxiv.org/abs/2304.02492
NEW PAPER: A texture statistics encoding model reveals hierarchical feature selectivity across human visual cortex. J Neurosci. JN-RM-1822-22. Link
NEW PAPER: Low-level tuning biases in higher visual cortex reflect the semantic informativeness of visual features. J of Vis. 23(8). Link
CONGRATULATIONS to Nadine for successfully defending her PhD thesis "Bridging the gap from human vision to computer vision" (April 25, 2023). Nadine has accepted a position at NVIDIA.
NEW PAPER: Selectivity for food in human ventral visual cortex. Commun Biol. 6, 175. Link | github
NEW PAPER: Why is human vision so poor in early development? The impact of initial sensitivity to low spatial frequencies on visual category learning. PLoS ONE. 18(1): e0280145. Link | github