LAMDA: Label Agnostic Mixup for Domain Adaptation in Iris Recognition
Prithviraj Dhar, Khushi Gupta, Rakesh Ranjan
IEEE International Joint Conference on Biometrics (IJCB) 2024
Iris Recognition (IR) is one of the most effective biometric authentication techniques available today, and an obvious candidate for authentication on most head-mounted devices. Networks for IR trained on datasets collected with existing hardware may not generalize to newer hardware due to the domain gap induced by changes in sensor configuration, noise, resolution, camera placement, etc. Coupled with the challenge of acquiring high-quality iris samples, domain adaptation in IR is an important topic that remains poorly studied in the literature. We introduce the problem of supervised Domain Adaptation (DA) for IR, where we assume access to abundant source training data but extremely limited labeled target training data. Additionally, we propose a novel mixup strategy called LAMDA that mitigates the domain gap between source and target IR datasets by augmenting samples from these datasets. Unlike existing mixup techniques, LAMDA does not require performing label-mixup, and it outperforms existing DA techniques in almost all of our problem settings, irrespective of the availability of target training data and across various image quality degradations. To facilitate further research, we also introduce new dataset splits for the problem.
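The central mechanism, mixing source and target images without mixing their labels, can be illustrated with a small sketch. The Beta-sampled mixing coefficient, the random pairing of batches, and keeping the source identity labels unchanged are illustrative assumptions, not necessarily the exact formulation in the paper.

```python
# Hypothetical sketch of a label-agnostic mixup step for source/target iris batches.
import torch

def label_agnostic_mixup(x_src, x_tgt, alpha=0.4):
    """Mix source and target images at the pixel level without mixing labels.

    The mixed image keeps the label of its source sample, so no label-mixup
    bookkeeping is required (assumption for illustration).
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    lam = max(lam, 1.0 - lam)               # keep the source image dominant (assumption)
    idx = torch.randperm(x_tgt.size(0))     # random pairing with target samples
    return lam * x_src + (1.0 - lam) * x_tgt[idx]

# Usage: x_mix = label_agnostic_mixup(src_images, tgt_images)
#        loss = criterion(model(x_mix), src_labels)   # labels are left untouched
```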
EyePAD++: A Distillation-based approach for joint Eye Authentication and Presentation Attack Detection using Periocular Images
Prithviraj Dhar, Amit Kumar, Kirsten Kaplan, Khushi Gupta, Rakesh Ranjan, Rama Chellappa
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
A practical eye authentication (EA) system targeted for edge devices needs to perform authentication and be robust to presentation attacks, all while remaining compute and latency efficient. However, existing eye-based frameworks (a) perform authentication and Presentation Attack Detection (PAD) independently and (b) involve significant pre-processing steps to extract the iris region. Here, we introduce a joint framework for EA and PAD using periocular images. While a deep Multitask Learning (MTL) network can perform both tasks, MTL suffers from the forgetting effect since the training datasets for EA and PAD are disjoint. To overcome this, we propose Eye Authentication with PAD (EyePAD), a distillation-based method that trains a single network for EA and PAD while reducing the effect of forgetting. To further improve the EA performance, we introduce a novel approach called EyePAD++ that trains an MTL network on both EA and PAD data, while distilling the 'versatility' of the EyePAD network through an additional distillation step. Our proposed methods outperform the SOTA in PAD and obtain near-SOTA performance in eye-to-eye verification, without any pre-processing. We also demonstrate the efficacy of EyePAD and EyePAD++ in user-to-user verification with PAD across network backbones and image quality.
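As a rough illustration of the distillation idea (not the exact EyePAD/EyePAD++ objective), the sketch below trains a student on PAD data while matching the features of a frozen teacher previously trained for EA; the MSE feature-matching term and the weight `beta` are assumptions.

```python
# Hypothetical distillation step: learn PAD while preserving EA knowledge from a frozen teacher.
import torch
import torch.nn.functional as F

def eyepad_step(student, teacher, pad_head, images, pad_labels, beta=1.0):
    with torch.no_grad():
        t_feat = teacher(images)          # frozen network previously trained for EA
    s_feat = student(images)              # network being trained on PAD data
    task_loss = F.cross_entropy(pad_head(s_feat), pad_labels)  # PAD objective
    distill_loss = F.mse_loss(s_feat, t_feat)                  # reduce forgetting of EA
    return task_loss + beta * distill_loss
```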
(Top row) Face recognition networks attend to different spatial regions of a face depending on protected attributes (shown here for the skintone attribute). (Bottom row) Our proposed method D&D++ constrains a network to attend to similar spatial regions for both light and dark skintones, and consequently reduces skintone bias. We report similar findings for the gender attribute.
Distill and De-bias: Mitigating Bias in Face Recognition using Knowledge Distillation
Prithviraj Dhar, Joshua Gleason, Aniket Roy, Carlos D. Castillo, P.J. Phillips and Rama Chellappa
Face recognition networks generally demonstrate bias with respect to sensitive attributes such as gender and skintone. For gender and skintone, we observe that the regions of the face that a network attends to vary with the category of the attribute, which might contribute to bias. Building on this intuition, we propose a novel distillation-based approach called Distill and De-bias (D&D) that constrains a network to attend to similar face regions, irrespective of the attribute category. In D&D, we train a teacher network on images from one category of an attribute, e.g., light skintone. Then, distilling information from the teacher, we train a student network on images of the remaining category, e.g., dark skintone. A feature-level distillation loss constrains the student network to generate teacher-like representations. This allows the student network to attend to similar face regions for all attribute categories and enables it to reduce bias. We also propose a second distillation step on top of D&D, called D&D++, in which we distill the 'un-biasedness' of the D&D network into a new student network trained on all attribute categories, e.g., both light and dark skintones. This yields a network that is less biased for an attribute while obtaining higher face verification performance than D&D. We show that D&D++ outperforms existing baselines in reducing gender and skintone bias on the IJB-C dataset, while obtaining higher face verification performance than existing adversarial de-biasing methods. We evaluate the effectiveness of our proposed methods on two state-of-the-art face recognition networks: Crystalface and ArcFace.
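A minimal sketch of the feature-level distillation step described above: the frozen teacher (trained on one attribute category) provides reference features for images of the other category, and the student is trained to classify identities while producing teacher-like features. The cosine-distance form of the distillation term and the weight `lam` are assumptions for illustration.

```python
# Hypothetical D&D-style student update on images of the remaining attribute category.
import torch
import torch.nn.functional as F

def dnd_student_step(student, teacher, id_head, images, id_labels, lam=1.0):
    with torch.no_grad():
        t_feat = teacher(images)          # teacher trained on, e.g., light-skintone images
    s_feat = student(images)              # student trained on, e.g., dark-skintone images
    id_loss = F.cross_entropy(id_head(s_feat), id_labels)
    # Feature-level distillation: pull student features toward teacher-like representations.
    distill_loss = 1.0 - F.cosine_similarity(s_feat, t_feat, dim=1).mean()
    return id_loss + lam * distill_loss
```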
PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition
Prithviraj Dhar*, Joshua Gleason*, Aniket Roy, Carlos D. Castillo, and Rama Chellappa
IEEE International Conference on Computer Vision (ICCV), 2021
Face recognition networks encode information about sensitive attributes while being trained for identity classification. Such encoding has two major issues: (a) it makes the face representations susceptible to privacy leakage, and (b) it appears to contribute to bias in face recognition. However, existing bias mitigation approaches generally require end-to-end training and are unable to achieve high verification accuracy. Therefore, we present a descriptor-based adversarial de-biasing approach called 'Protected Attribute Suppression System (PASS)'. PASS can be trained on top of descriptors obtained from any previously trained high-performing network to classify identities while simultaneously reducing the information about sensitive attributes. This eliminates the need for end-to-end training. As a component of PASS, we present a novel discriminator training strategy that discourages a network from encoding protected attribute information. We show the efficacy of PASS in reducing gender and skintone information in descriptors from SOTA face recognition networks like ArcFace. As a result, PASS descriptors outperform existing baselines in reducing gender and skintone bias on the IJB-C dataset, while maintaining high verification accuracy.
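The descriptor-level adversarial setup can be sketched as alternating updates between a small de-biasing MLP, trained to keep identity information while confusing an attribute discriminator, and the discriminator itself. The names `debias_mlp`, `id_head`, and `attr_discriminator`, the uniform-target confusion loss, and the weight `lam` are illustrative assumptions, not the paper's exact discriminator training strategy.

```python
# Hypothetical adversarial suppression on pre-computed face descriptors.
import torch
import torch.nn.functional as F

def pass_generator_step(debias_mlp, id_head, attr_discriminator, desc, id_labels, lam=10.0):
    z = debias_mlp(desc)                               # de-biased descriptor
    id_loss = F.cross_entropy(id_head(z), id_labels)   # preserve identity information
    attr_logits = attr_discriminator(z)
    # Push the discriminator toward a uniform prediction, i.e. suppress attribute cues.
    uniform = torch.full_like(attr_logits, 1.0 / attr_logits.size(1))
    confusion = F.kl_div(F.log_softmax(attr_logits, dim=1), uniform, reduction="batchmean")
    return id_loss + lam * confusion

def pass_discriminator_step(debias_mlp, attr_discriminator, desc, attr_labels):
    with torch.no_grad():
        z = debias_mlp(desc)                           # descriptors are fixed for this step
    return F.cross_entropy(attr_discriminator(z), attr_labels)
```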
How are attributes expressed in face DCNNs?
Prithviraj Dhar, Ankan Bansal, Carlos D. Castillo, Joshua Gleason, P. J. Phillips and Rama Chellappa
IEEE International Conference on Automatic Face & Gesture Recognition (FG), 2020
As deep networks become increasingly accurate at recognizing faces, it is vital to understand how these networks process faces. While these networks are trained solely to recognize identities, they also contain face-related information such as the sex, age, and pose of the face, even though they are never trained to learn these attributes. We introduce expressivity as a measure of how much a feature vector informs us about an attribute, where the feature vector can come from internal or final layers of a network. Expressivity is computed by a second neural network whose inputs are features and attributes; its output approximates the mutual information between the feature vectors and an attribute. We investigate expressivity for two deep convolutional neural network (DCNN) architectures: a ResNet-101 and an Inception-ResNet-v2. In the final fully connected layer of the networks, we found the order of expressivity for facial attributes to be Age > Sex > Yaw. Additionally, we studied how the encoding of facial attributes changes over training iterations and found that, as training progresses, the expressivities of yaw, sex, and age decrease. Our technique can serve as a tool for investigating the sources of bias in a network and is a step towards explaining the network's identity decisions.
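A sketch of how such a second network could approximate the mutual information, in the style of a MINE-type estimator using the Donsker-Varadhan bound. The architecture, hidden size, and the shuffle-based marginal samples are assumptions; the paper's exact estimator may differ.

```python
# Hypothetical MINE-style estimator of I(feature; attribute) used as an expressivity score.
import math
import torch
import torch.nn as nn

class ExpressivityEstimator(nn.Module):
    def __init__(self, feat_dim, attr_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + attr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats, attrs):
        # Joint samples vs. samples with attributes shuffled (product of marginals).
        joint = self.net(torch.cat([feats, attrs], dim=1)).squeeze(1)
        shuffled = attrs[torch.randperm(attrs.size(0))]
        marginal = self.net(torch.cat([feats, shuffled], dim=1)).squeeze(1)
        # Donsker-Varadhan lower bound; maximizing it trains the estimator.
        return joint.mean() - (torch.logsumexp(marginal, dim=0) - math.log(marginal.size(0)))
```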
Learning without Memorizing
Prithviraj Dhar*, Rajat Vikram Singh*, Kuan-Chuan Peng, Ziyan Wu, Rama Chellappa
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Incremental learning (IL) is an important task aimed at increasing the capability of a trained model in terms of the number of classes it can recognize. The key problem in this task is the need to store data (e.g., images) associated with existing classes while teaching the classifier to learn new classes. This is impractical, as it increases the memory requirement at every incremental step, making it impossible to implement IL algorithms on edge devices with limited memory. Hence, we propose a novel approach, called 'Learning without Memorizing (LwM)', to preserve information about existing (base) classes, without storing any of their data, while making the classifier progressively learn the new classes. In LwM, we present an information preserving penalty, the Attention Distillation Loss (L_AD), and demonstrate that penalizing changes in the classifier's attention maps helps retain information about the base classes as new classes are added. We show that adding L_AD to the distillation loss, an existing information preserving loss, consistently outperforms the state of the art on the iILSVRC-small and iCIFAR-100 datasets in terms of the overall accuracy of base and incrementally learned classes.
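The attention-distillation penalty can be sketched as an L1 distance between per-image normalized attention maps of the frozen base model and the model being updated. The maps are taken as given (e.g., Grad-CAM-style outputs); their computation and the exact normalization used here are assumptions.

```python
# Hypothetical attention-distillation penalty in the spirit of L_AD.
import torch

def attention_distillation_loss(att_base, att_new, eps=1e-8):
    """att_base, att_new: attention maps of shape (B, H, W) for the same images."""
    q_base = att_base.flatten(1)
    q_new = att_new.flatten(1)
    q_base = q_base / (q_base.norm(dim=1, keepdim=True) + eps)   # L2-normalize per image
    q_new = q_new / (q_new.norm(dim=1, keepdim=True) + eps)
    return (q_new - q_base).abs().sum(dim=1).mean()              # penalize map changes
```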
Our approach assigns an iconicity score of 0.84 and 0.33 to (a) and (b), respectively
On measuring the iconicity of a face
Prithviraj Dhar, Carlos D. Castillo and Rama Chellappa
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2019
For a given identity in a face dataset, certain iconic images are more representative of the subject than others. In this paper, we explore the problem of computing the iconicity of a face. The premise of the proposed approach is as follows: for an identity containing a mixture of iconic and non-iconic images, if a given face cannot be successfully matched with any other face of the same identity, then the iconicity of that face image is low. Using this information, we train a Siamese Multi-Layer Perceptron network such that each of its twins predicts an iconicity score for the image feature pair fed in as input. We observe how the obtained scores vary with covariates such as blur, yaw, pitch, roll, and occlusion to demonstrate that they effectively predict image quality, and we compare them with other existing metrics. Furthermore, we use these scores to weight features for template-based face verification and compare this approach with media averaging of features.
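A minimal sketch of the Siamese scoring network described above: a shared MLP maps each feature of an input pair to a scalar iconicity score, which can later weight features when pooling a template. The layer sizes, the sigmoid output, and the pooling formula are assumptions for illustration.

```python
# Hypothetical Siamese-MLP iconicity scorer operating on face feature pairs.
import torch
import torch.nn as nn

class IconicityScorer(nn.Module):
    def __init__(self, feat_dim, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, feat_a, feat_b):
        # Shared weights: each "twin" scores one member of the feature pair.
        return self.mlp(feat_a), self.mlp(feat_b)

# Scores can weight features when building a template (illustrative):
#   template = (scores * feats).sum(dim=0) / scores.sum(dim=0)
```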