Visual information processing needs to fulfill two important goals. On the one hand, there is a need to comprehend the entire visual input at multiple distinct levels to have an informed and stable representation of our environment. On the other hand, visual processing needs to be selective and adaptive, as only a fraction of the visual input may be useful to guide our thoughts and actions at a given moment (Xu, 2018 Annu Rev Vis Sci). My research has focused on these two key aspects of visual information processing using cutting-edge functional magnetic resonance imaging (fMRI) methods on humans. I have made several significant empirical and theoretical contributions regarding the involvement of the human posterior parietal cortex (PPC) in supporting the adaptive and online aspect of visual information processing critical to visual working memory (VWM), decision making and behavior. My work has also significantly advanced our understanding of visual processing at multiple levels in the human occipito-temporal cortex (OTC) such as the representation of parts, objects and ensembles of objects. Together, my empirical results and theoretical analyses have revised the “what vs where/how” two-pathway distinction previously proposed to characterize visual processing in the OTC and PPC. I argue instead that such a two-pathway distinction better reflects the invariant vs adaptive aspects of visual processing in the primate brain. The presence of these two complementary visual processing systems captures the demands of visual processing, providing us with both a stable representation of the visual environment and allowing us to interact flexibly and efficiently with the world.
In more recent work, I also examined the nature of visual processing in convolutional neural networks (CNNs), evaluated CNN modeling as a viable scientific method to understand primate vision, and found that there are similarities as well as fundamental differences in visual processing between the brain and CNNs. This limits CNN modeling in its current state as a shortcut to understand primate vision.
To study the neural mechanisms mediating the adaptive aspect of visual information processing, I focus on the neural mechanisms supporting VWM, as VWM provides the critical buffer needed to retain information for the task at hand. In earlier behavioral studies, I documented the impact of object-based encoding and feature binding on VWM capacity (Xu, 2002 JEP:HPP; Xu, 2002 Percep Psych; Xu, 2006 Percep Psych). Applying similar manipulations to examine the human brain and measuring fMRI response amplitudes, I discovered dissociating roles of the human inferior intra-parietal sulcus (IPS) and superior IPS in determining VWM capacity (Xu and Chun, 2006 Nature; Xu and Chun, 2007 PNAS; Xu, 2007 J Neurosci; Xu, 2008 Neuroimage; Xu, 2009 J Cogn Neurosci; Xu, 2010 J Neurosci; Jeong and Xu, 2013 J Cogn Neurosci). Specifically, while the inferior IPS is involved in individuating and selecting a fixed number of four items among multiple competing items, the superior IPS participates in the encoding and storage of the selected items into VWM. This discovery has helped resolve a critical debate in VWM regarding whether storage capacity is fixed or variable. My results show that both can be true, depending on the processing stage. These results have led me to propose the neural object file theory (Xu and Chun, 2009 TICS), which argues that the same two-stage processing may occur in the brain whenever multiple items are competing for representation, with the inferior IPS involved in selecting objects based on spatial information and the superior IPS involved in encoding and retaining detailed object information. This neural object-file theory can explain various forms of capacity-limited processing in visual cognition beyond VWM and can account for a large number of existing human behavioral and patient findings.
Having linked the PPC to adaptive visual processing through its role in VWM, in more recent work using fMRI multi-voxel pattern analysis (MVPA), I have documented the direct representation of VWM content in the human superior IPS (Bettencourt and Xu, 2016 Nat Neurosci). The PPC is best known for its role in space, attention and action-related processing. The direct representation of visual information in the PPC poses a significant challenge to these existing views on the PPC. Further work shows that VWM representations in the superior IPS, but not those in early visual areas, track behavioral performance and are unaffected by the presence and predictability of distractors (Bettencourt and Xu, 2016 Nat Neurosci). These results, together with a thorough review of the literature, have prompted me to critically reexamine the sensory account of VWM storage, a view currently dominating the VWM literature. I show that the available evidence in human behavioral, fMRI, and transcranial magnetic stimulation studies as well as monkey neurophysiological studies does not support the sensory regions as playing central roles in VWM storage. Instead, existing evidence supports the PPC and the prefrontal cortex as the primary regions involved in VWM storage (Xu, 2017 TICS and Xu, 2018 TICS).
The PPC’s ability to directly represent visual information in VWM tasks during adaptive visual processing prompted me to thoroughly document the range of visual information that can be directly represented in the PPC. Across a variety of different experimental paradigms, I observe the representations of a diverse array of visual information in the human PPC using fMRI MVPA measures. They include low-, mid-, and high-level visual information such as orientation, shape, viewpoint invariant object identity, and object category (Xu and Jeong, 2015 Attention and Performance XXV; Bettencourt and Xu, 2016 Nat Neurosci; Jeong and Xu, 2016 J Neurosci; Jeong and Xu, 2017 J Cogn Neurosci; Vaziri-Pashkam and Xu, 2017 J Neurosci; Vaziri-Pashkam and Xu, in press Cereb Cortex; Vaziri-Pashkam, et al., in press J Cogn Neurosci). Moreover, I show that the representational similarity of the visual information held in the PPC is closely correlated with perception and behavioral performance (Bettencourt and Xu, 2016 Nat Neurosci; Jeong and Xu, 2016 J Neurosci). Visual representations in the PPC also exhibit tolerance to low-level image transformations such as position, size and spatial frequency, much like those in higher-level regions in the OTC (Vaziri-Pashkam and Xu, 2019 Cereb Cortex; Vaziri-Pashkam, et al., 2019 J Cogn Neurosci). Incorporating these findings as well as a survey of the broader human imaging and monkey neurophysiology literature, in a recent review I provide a detailed and comprehensive documentation of the range of visual information that can be directly represented in the PPC and its tolerance to image transformations (Xu, 2018 Annu Rev Vis Sci).
Despite the existence of visual representations in both the OTC and the PPC, as described earlier, my research on VWM shows that visual representations in the PPC exhibit an overall greater resistance to distraction than those in the OTC (Bettencourt and Xu, 2016 Nat Neurosci). By directly manipulating attention and task, I further show that visual representations in the PPC are under greater attention and task control than those in the OTC (Vaziri-Pashkam and Xu, 2017 J Neurosci; Jeong and Xu, 2017 J Cogn Neurosci; see also Xu, 2010, J Neurosci; Jeong and Xu, 2013, J Cogn Neurosci). Together, these results highlight the adaptive nature of visual representation in the PPC and support a two-pathway separation between the OTC and the PPC based on the invariant vs adaptive aspect of visual information processing, rather than the “what” vs “where/how” distinction previously proposed (Xu, 2018, Annu Rev Vis Sci).
The direct representation of a diverse array of visual information in the PPC is incompatible with the three major functions previously ascribed to the PPC focusing on spatial, attentional, or action-related processing. Rather than further contributing to an already fragmented literature regarding the precise function of the PPC, in another recent review, by examining functional and anatomical evidence, I show that visual representation is distinct from these three major PPC functions and at the same time closely interacts with them. I argue that we can bring convergence among these distinct PPC functions through PPC’s role in adaptive visual processing and form a structured, integrated and unified understanding of PPC’s role in vision, cognition and action (Xu, 2018, TINS). Such a framework is not only capable of accommodating a large body of existing PPC findings and bridging disparate lines of PPC research, but it can also provide critical guidance to future studies on the PPC and better connect them to the existing PPC literature.
To fully understand the adaptive nature of visual processing in the PPC, one also needs to understand the processing of the different types of visual information in the OTC, as the PPC receives a large amount of input from the OTC during adaptive visual processing (see Xu, 2018, Annu Rev Vis Sci for a review of the supporting evidence). To that end, I have made several discoveries regarding the neural underpinnings involved at distinctive levels of visual information processing in the OTC.
Single objects. At the single object level, I have studied the neural mechanisms specialized for face processing in the OTC. By examining the part-whole relationship critical for object recognition in the OTC using fMRI MVPA, I have unveiled a previously unknown neural impairment in face configural processing in individuals with developmental prosopagnosia (DPs) (Zhang et al., 2015, J Neurosci). This impairment is directly linked to DPs’ behavioral deficit in face processing and helps solve a long-standing mystery in face research. Using fMRI response amplitude measures and MVPA as well as magnetoencephalography (MEG) measures, I have also documented the interaction between visual expertise and the neural mechanisms mediating face processing in the OTC (Xu, 2015, Cereb Cortex; Xu, et al., 2015, Neuropsychologia; Ross et al., 2018, J Cogn Neurosci).
Multiple objects. In real world perception and action, the selection and representation of multiple visual objects are often critical. At the multiple object level, besides my VWM work unveiling the importance of PPC mechanisms in selecting and representing multiple visual objects in visual cognition (Xu and Chun, 2006 Nature; Xu and Chun, 2007 PNAS; Xu and Chun, 2009 TICS; Xu, 2007 J Neurosci; Xu, 2008 Neuroimage; Xu, 2009 J Cogn Neurosci; Xu, 2010 J Neurosci; Jeong and Xu, 2013 J Cogn Neurosci), using fMRI MVPA, I have also examined how multiple objects may be represented in the OTC. Specifically, I describe how contextual association between a pair of objects and the layout of several objects may be directly represented in the OTC (Wang and Xu, in preparation).
Object ensembles. In everyday vision, we can simultaneously represent a large number of objects (object ensembles), such as the people in a crowd and the trees in a forest, by extracting summary statistics from such object ensembles without forming detailed representations of the individual objects comprising the ensemble. Object ensemble representation thus complements the capacity-limited but detailed representation of single and multiple objects. Despite their importance in visual perception, the neural mechanisms mediating object ensemble representation are vastly understudied. Using fMRI adaptation, I have documented the unique role of the anterior-medial ventral visual cortex in object ensemble representation and the ensemble features that this brain region may encode (Cant and Xu, 2012 J Neurosci; Cant and Xu, 2015 Cereb Cortex; Cant and Xu, 2017 J Cogn Neurosci; Cant et al., 2015 J Vis). Given this brain region’s additional involvement in texture and scene representation, I argue that the anterior-medial ventral visual cortex may play a general role in summary statistics representation that can account for its role in texture, ensemble and scene representation (Cant and Xu, 2012 J Neurosci; Cant and Xu, 2015 Cereb Cortex; Cant and Xu, 2017 J Cogn Neurosci).
My current and future research aims to further our understanding of both the invariant and the adaptive nature of visual information processing in the human brain. Overall, my empirical findings and theoretical contributions have brought fundamental insights regarding the mechanisms and algorithms our brain uses to select and represent visual information to guide thoughts, decisions and behavior. This work directly bridges and complements existing monkey neurophysiology work on the PPC. My continuous effort to understand the neural coding schemes used by the OTC and the PPC for visual information processing will not only enrich our understanding of the human brain, but can also facilitate the development of better computational models that are capable of implementing vital human visual abilities in artificial intelligence.