Deep learning / Computer vision

The Algonauts challenge is about predicting neural object representations, in the form of Representational Dissimilarity Matrices (RDMs), derived from visual brain regions. We used a customized Siamese deep learning model with group convolutions to predict the neural distances corresponding to image pairs.

Predicting population neural activity in the Algonauts challenge using end-to-end trained Siamese networks and group convolutions, arXiv preprint arXiv:2001.05841
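To make the approach concrete, here is a minimal sketch of the pairwise setup described above: a Siamese trunk with shared weights processes both images of a pair, grouped convolutions keep channel groups separate, and a small head regresses the neural dissimilarity (one RDM entry). This is an illustrative reconstruction, not the published architecture; all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class SiameseRDMPredictor(nn.Module):
    def __init__(self, n_groups=4):
        super().__init__()
        # Shared trunk; groups=n_groups keeps channel groups independent,
        # loosely analogous to fitting several channel groups in parallel.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1, groups=n_groups),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Head maps the absolute feature difference of the pair to one distance.
        self.head = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, img_a, img_b):
        feat_a = self.trunk(img_a)          # shared weights for both images
        feat_b = self.trunk(img_b)
        return self.head((feat_a - feat_b).abs()).squeeze(-1)

model = SiameseRDMPredictor()
pair = torch.randn(8, 3, 128, 128), torch.randn(8, 3, 128, 128)
pred_dist = model(*pair)                                   # one predicted RDM entry per pair
loss = nn.functional.mse_loss(pred_dist, torch.rand(8))    # fit to measured neural distances
```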

We used custom deep convolutional architectures as a proxy for biological vision and demonstrated the functional benefits of foveal vision. Our results suggest that human-like peripheral blur can improve object classification accuracy.

RT Pramod, H Katti, SP Arun, Human peripheral blur is optimal for object recognition, Vision Research 200, 108083. https://www.sciencedirect.com/science/article/abs/pii/S004269892200089X
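As an illustration of foveated input (not the code used in the paper), the sketch below applies eccentricity-dependent Gaussian blur to an image before it is fed to a classifier; the blur levels and blending scheme are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def foveate(image, max_sigma=8.0):
    """Approximate peripheral blur: blur strength grows with distance from the image centre."""
    h, w = image.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    # Eccentricity: 0 at the centre, 1 at the corners.
    ecc = np.hypot(yy - h / 2, xx - w / 2) / np.hypot(h / 2, w / 2)
    out = np.zeros_like(image, dtype=np.float32)
    # A few discrete blur levels; each pixel takes the level matching its eccentricity.
    sigmas = np.linspace(0, max_sigma, 5)
    level = np.clip((ecc * (len(sigmas) - 1)).round().astype(int), 0, len(sigmas) - 1)
    for i, s in enumerate(sigmas):
        blurred = image.astype(np.float32) if s == 0 else gaussian_filter(
            image.astype(np.float32), sigma=(s, s, 0))
        out[level == i] = blurred[level == i]
    return out.astype(image.dtype)

img = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)  # stand-in image
foveated = foveate(img)                                      # feed this to the classifier
```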

We compared the performance of popular face-categorisation deep convolutional networks with human behaviour in a fine-grained categorisation task. Despite training with large datasets and extensive computational resources, deep networks were unable to learn human-like representations.

H Katti, SP Arun, Are you from North or South India? A hard face-classification task reveals systematic representational differences between humans and machines, Journal of Vision, 19(7):1, 2019, doi:10.1167/19.7.1. Code and dataset: https://github.com/harish2006/IISCIFD
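A hedged sketch of the kind of representational comparison involved (not the paper's analysis code): pairwise distances in a CNN feature space are correlated with human-derived dissimilarities for the same face pairs; the arrays below are random stand-ins.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

cnn_features = np.random.rand(100, 512)        # one feature vector per face image (stand-in)
human_dissim = np.random.rand(100 * 99 // 2)   # human pairwise dissimilarities, condensed form

cnn_dissim = pdist(cnn_features, metric="correlation")  # network pairwise distances
rho, p = spearmanr(cnn_dissim, human_dissim)            # representational match
print(f"representation match (Spearman rho) = {rho:.2f}, p = {p:.3g}")
```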

In this series of papers, we evaluated the effectiveness of domain transfer in deep learning for predicting affective information in advertisement videos. We explored approaches such as fusing audio and visual information at the input stage, and also evaluated multi-task learning for this problem (a minimal sketch of both ideas follows the citations below).

A Shukla, SS Gullapuram, H Katti, M Kankanhalli, S Winkler, S Ramanathan, Recognition of Advertisement Emotions with Application to Computational Advertising, arXiv preprint arXiv:1904.01778 (research article, under review)

A Shukla, H Katti, M Kankanhalli, R Subramanian, Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements, ICMI, 2018 (research article)

A Shukla, SS Gullapuram, H Katti, K Yadati, M Kankanhalli, Evaluating content-centric vs. user-centric ad affect recognition, Proceedings of the 19th ACM ICMI, 2017 (research article)

A Shukla, SS Gullapuram, H Katti, K Yadati, M Kankanhalli, Affect recognition in ads with application to computational advertising, Proceedings of the 25th ACM MM, 1148-1156 (research article)
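A minimal sketch of the two ideas mentioned above, under assumed input shapes: audio-visual fusion at the input stage (a spectrogram stacked as an extra channel) and a multi-task head predicting valence and arousal from a shared backbone. This is not any of the published models; all layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class EarlyFusionAffectNet(nn.Module):
    def __init__(self):
        super().__init__()
        # 3 RGB channels + 1 audio-spectrogram channel, fused before the first conv.
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.valence_head = nn.Linear(64, 1)   # multi-task outputs share the backbone
        self.arousal_head = nn.Linear(64, 1)

    def forward(self, frame_rgb, spectrogram):
        x = torch.cat([frame_rgb, spectrogram], dim=1)  # input-stage audio-visual fusion
        feat = self.backbone(x)
        return self.valence_head(feat), self.arousal_head(feat)

model = EarlyFusionAffectNet()
frames = torch.randn(4, 3, 112, 112)   # video frames
spec = torch.randn(4, 1, 112, 112)     # audio spectrogram resized to the frame size
valence, arousal = model(frames, spec)
```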

We explored whether representations learnt by deep convolutional networks and by popular computer vision models of object structure can predict human contextual expectations and detection responses. We demonstrated that late decision fusion of deep network outputs with human-derived priors can improve detection accuracy (a simple fusion sketch follows the citations below).

H Katti, MV Peelen, SP Arun, Machine vision benefits from human contextual expectations, Scientific Reports 9 (1), 2112 (research article) Code and dataset: https://github.com/harish2006/cntxt_likelihood

H Katti, MV Peelen, SP Arun, How do targets, nontargets, and scene context influence real-world object detection? Attention, Perception, & Psychophysics 79 (7), 2021-2036 (research article)
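A simplified sketch of late decision fusion (an assumed form, not the published method): a detector's confidence is combined with a human-derived contextual prior for the same scene before the final decision. The weights and scores below are hypothetical.

```python
import numpy as np

def fuse_scores(detector_score, human_prior, w=0.5):
    """Weighted late fusion of detector confidence and human contextual expectation (both in [0, 1])."""
    return w * detector_score + (1.0 - w) * human_prior

detector_scores = np.array([0.35, 0.80, 0.10])   # CNN detection confidences (hypothetical)
human_priors = np.array([0.90, 0.20, 0.05])      # human likelihood ratings for the same scenes
fused = fuse_scores(detector_scores, human_priors)
decisions = fused > 0.5                          # final detection decisions after fusion
```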

We created a stereoscopic dataset of indoor natural scenes, characterised the effect of depth information on visual saliency during free viewing, and proposed a computational model of the saliency modulation arising from depth cues.

C Lang*, TV Nguyen*, H Katti*, K Yadati, M Kankanhalli, S Yan, Depth matters: Influence of depth cues on visual saliency, European Conference on Computer Vision (ECCV), 101-115 (research article) NUS3D-Saliency Dataset: https://sites.google.com/site/vantam/nus3d-saliency-dataset
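An illustrative sketch (not the published model) of modulating a 2D saliency map with a depth prior, following the intuition that nearer regions tend to attract fixations; the blending weight is an assumption.

```python
import numpy as np

def depth_modulated_saliency(saliency_2d, depth, alpha=0.6):
    """Blend a 2D saliency map with a nearness prior derived from the depth map."""
    nearness = 1.0 - (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)  # 1 = nearest
    combined = alpha * saliency_2d + (1.0 - alpha) * nearness
    return combined / (combined.max() + 1e-8)

saliency_2d = np.random.rand(240, 320)   # output of any 2D saliency model (stand-in)
depth = np.random.rand(240, 320) * 5.0   # depth map in metres (stand-in)
saliency_3d = depth_modulated_saliency(saliency_2d, depth)
```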

We showed the usefulness of eye-fixation priors for hard object segmentation problems in natural scenes.

S Ramanathan, H Katti, N Sebe, M Kankanhalli, TS Chua, An eye fixation database for saliency detection in images, Computer Vision–ECCV 2010, 30-43 (research article) NUSEF eye fixation dataset, http://ncript.comp.nus.edu.sg/site/mmas/NUSEF.html
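A rough sketch, with assumed inputs, of using a fixation density map as a segmentation prior: strongly fixated regions seed probable foreground for OpenCV's GrabCut and rarely fixated regions seed background. The thresholds and the example file name are hypothetical, and this is not the paper's pipeline.

```python
import cv2
import numpy as np

def fixation_seeded_segmentation(image_bgr, fixation_map, fg_thresh=0.6, bg_thresh=0.1):
    """image_bgr: HxWx3 uint8 image; fixation_map: HxW fixation density in [0, 1]."""
    mask = np.full(fixation_map.shape, cv2.GC_PR_BGD, dtype=np.uint8)
    mask[fixation_map > fg_thresh] = cv2.GC_PR_FGD   # strongly fixated -> probable foreground
    mask[fixation_map < bg_thresh] = cv2.GC_BGD      # rarely fixated -> definite background
    bgd_model = np.zeros((1, 65), np.float64)
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, None, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_MASK)
    return np.isin(mask, (cv2.GC_FGD, cv2.GC_PR_FGD)).astype(np.uint8)

image = cv2.imread("scene.jpg")                      # any natural scene (hypothetical file)
fixations = np.random.rand(*image.shape[:2])         # stand-in for a real fixation density map
object_mask = fixation_seeded_segmentation(image, fixations)
```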