research on natural and artificial systems:
the constraints necessary to "build" a visual system
better-articulated models of intermediate and high-level vision
the organizational principles of visual cortex
how vision interacts with other modalities
tools drawn from cognitive science, machine learning, computer vision, computer graphics, and large-scale neuroimaging, including fMRI, DTI, MEG, EEG
I will not be taking new trainees for 2025-26
but CMU just hired Maggie Henderson, Jenelle Feather,
Jonathan Tsay, Xaq Pitkow, and Aran Nayebi
You should reach out to them!
news
¯\_(ツ)_/¯ NEW FEATURE: We have a custom lab GPT trained on the website and some of our papers. It appears to provide pretty solid summaries on all sorts of things related to our work. Try it out here: tarrlabGPT
NEW RESEARCH: BrainSAIL! Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers by Andrew Luo. arXiv
CONGRATULATIONS to Andrew Luo for successfully defending his PhD thesis "Computational Exploration of Higher Visual Selectivity in the Human Brain" (September 26, 2024)!
NEW NEURIPS PAPER: Divergences between Language Models and Human Brains by Yuchen Zhou. arXiv
NEW NEURIPS PAPER (Spotlight): VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought by Gabe Sarch. arXiv (more memorable title here than on arXiv)
CONGRATULATIONS to lab member Yuchen Zhou, who has accepted a research position at Meta AI starting in summer 2025!
CONGRATULATIONS to lab member Andrew Luo, who has accepted a faculty position at the University of Hong Kong as of mid-Fall 2024!
CONGRATULATIONS to lab member Maggie Henderson, who will be starting a faculty position in Psychology and the Neuroscience Institute as of Fall 2024!
CONGRATULATIONS to lab alumna Isabel Gauthier, who is the twelfth recipient of the Davida Teller Award for her many contributions to vision science, including the role of expertise in object recognition and her strong history of mentoring!
NEW FUN PAGE: Visualizing mixed metaphors -- share your best ones with us!
NEW PAPER: Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset. Nat Mach Intell. paper | supplemental | non-paywalled
NEW RESEARCH: We introduce a data-driven method that generates natural language descriptions for images predicted to maximally activate individual voxels of interest. Our method -- Semantic Captioning Using Brain Alignments ("BrainSCUBA") -- builds upon the rich embedding space learned by a contrastive vision-language model and utilizes a pre-trained large language model to generate interpretable captions. We validate our method through fine-grained voxel-level captioning across higher-order visual regions. We further perform text-conditioned image synthesis with the captions, and show that our images are semantically coherent and yield high predicted activations. Finally, to demonstrate how our method enables scientific discovery, we perform exploratory investigations on the distribution of "person" representations in the brain, and discover fine-grained semantic selectivity in body-selective areas. Unlike earlier studies that decode text, our method derives voxel-wise captions of semantic selectivity. To be presented at the International Conference on Learning Representations (ICLR).
arXiv preprint: https://doi.org/10.48550/arXiv.2310.04420
CONGRATULATIONS to Jayanth Koushik for successfully defending his PhD thesis "Rethinking object categorization in computer vision" (September 26, 2023).
NEW RESEARCH: We introduce a new task, Labeling Instruction Generation, to address the lack of publicly available labeling instructions for large-scale visual datasets. In Labeling Instruction Generation, we take a reasonably annotated dataset and: 1) generate a set of examples that are visually representative of each category in the dataset; 2) provide a text label that corresponds to each of the examples. We introduce a framework that requires no model training to solve this task and includes a newly created rapid retrieval system that leverages a large, pre-trained vision and language model. This framework acts as a proxy for human annotators and can help to both generate a final labeling instruction set and evaluate its quality. Our framework generates multiple diverse visual and text representations of dataset categories.
arXiv preprint: https://arxiv.org/abs/2306.14035
CONGRATULATIONS to Aria for successfully defending her PhD thesis "Using Task Driven Methods to Uncover Representations of Human Vision and Semantics" (June 23, 2023). Aria will be moving to NIH this summer.
NEW RESEARCH: We introduce a data-driven approach in which we synthesize images predicted to activate a given brain region using paired natural images and fMRI recordings, bypassing the need for category-specific stimuli. Our approach -- Brain Diffusion for Visual Exploration ("BrainDiVE") -- builds on recent generative methods by combining large-scale diffusion models with brain-guided image synthesis.
Accepted for NeurIPS 2023!
arXiv preprint: https://doi.org/10.48550/arXiv.2306.03089
NEW RESEARCH: We've trained a neural network to predict brain responses to images, and then "dissected" the network to examine the selectivity of spatial properties across high-level visual areas. Discover more about our work: brain-dissection.github.io
Accepted for NeurIPS 2023!
Gabe's twitter thread: https://twitter.com/GabrielSarch/status/1663950775284801536?s=20
BioRxiv preprint: https://doi.org/10.1101/2023.05.29.542635
NEW RESEARCH: Children typically learn the meanings of nouns earlier than the meanings of verbs. However, it is unclear whether this asymmetry is a result of complexity in the visual structure of categories in the world to which language refers, the structure of language itself, or the interplay between the two sources of information. We quantitatively test these three hypotheses regarding early verb learning by employing visual and linguistic representations of words sourced from large-scale pre-trained artificial neural networks.
arXiv preprint: https://arxiv.org/abs/2304.02492
NEW PAPER: A texture statistics encoding model reveals hierarchical feature selectivity across human visual cortex. J Neurosci. JN-RM-1822-22. Link
NEW PAPER: Low-level tuning biases in higher visual cortex reflect the semantic informativeness of visual features. J of Vis. 23(8). Link
CONGRATULATIONS to Nadine for successfully defending her PhD thesis "Bridging the gap from human vision to computer vision" (April 25, 2023). Nadine has accepted a position at NVIDIA.
NEW PAPER: Selectivity for food in human ventral visual cortex. Commun Biol. 6, 175. Link | github
NEW PAPER: Why is human vision so poor in early development? The impact of initial sensitivity to low spatial frequencies on visual category learning. PLoS ONE. 18(1): e0280145. Link | github