
ISSUE 1, VOL. 1, YEAR 1, JAN. 2005, Pages 31-38

Face Verification Schemes for Mobile Personal Devices

Sabah Jassim & Harin Sellahewa

Department of Information Systems, University of Buckingham, Buckingham MK18 1EG, U.K.
sabah.jassim@buckingham.ac.uk & harin.sellahewa@buckingham.ac.uk

Abstract: The fast-growing public ownership and use of mobile personal programmable communication devices (e.g. PDAs) create new and exciting opportunities for convenient and secure commercial and service-providing transactions. The newer generations of these devices incorporate digital cameras and signature pads, raising the possibility of enhancing the security of mobile transactions using biometric-based authentication. However, implementing biometrics-based authentication on devices that are constrained in memory size and computational power is a tough technological task. This paper is primarily concerned with wavelet techniques for face verification as part of a multi-modal biometrics-based identity verifier for PDAs that would also include voice and handwritten signature. We present preliminary results comparing the accuracy rates of three face verification schemes: PCA applied in the spatial domain, PCA applied in the low-low (LL) subband of wavelet-transformed images down to level 5, and wavelet-only feature vectors. Although the experimental sample is relatively small, the results indicate that at levels 4 and 5 of wavelet decomposition, the latter two verification schemes perform significantly better than the spatial PCA. The wavelet-only feature vector scheme has the best accuracy rate and is the most efficient, making it the most suitable for constrained devices.

Keywords: Wavelets, Biometrics, PCA, Eigenfaces, SecurePhone

1. Introduction

The rapid advances in Internet technology have revolutionised the way business transactions and service delivery are carried out, anytime and anywhere. However, the Internet is also a major source of security breaches. The confidentiality of online transactions in transit is protected by cryptographic tools such as SSL, but such tools do not protect against fraud or repudiation. The fast-growing public ownership and use of mobile personal programmable communication devices (e.g. PDAs) create new and exciting opportunities for convenient and secure commercial and service-providing transactions. The newer generations of these devices incorporate digital cameras and signature pads, raising the possibility of using biometric-based authentication to protect mobile transactions against fraud and repudiation. SecurePhone is a European-funded project that aims to develop a multi-modal biometric verifier for PDAs using voice, face and handwritten signature. The work on face verification reported here feeds into the research carried out by the SecurePhone consortium.

Human identification based on facial images is one of the most challenging tasks in comparison to identification based on other biometric features such as fingerprints, palm prints or the iris. Yet, due to its unobtrusive nature, facial recognition is naturally the most suitable method of identification for security-related applications [1, 2]. The availability of high-resolution, low-cost digital image capturing devices makes accurate facial recognition a near possibility today, but implementing face recognition on the small devices (such as smart cards, mobile phones and PDAs) that are increasingly becoming part of our day-to-day life is a tough challenge due to their constrained memory and computational power. The tragic events of 9/11, the rise of international terrorism in recent years, and the rapid increase in credit card fraud have added urgency to the need for new, efficient facial and speech-based biometrics as a reliable alternative to conventional methods of person verification. In future, such systems may be used to provide an additional layer of security for online transactions or for real-time surveillance.

Identity checks come in two forms: recognition, whereby a person (or a representation of a person) presented to the system is to be identified as a member of a group of persons; and verification, whereby the system is required to verify a claimed identity. In this paper, we are concerned with verification (i.e. the one-to-one identity check). An important part of a face recognition/verification process is the extraction of features from a given facial image. Two current approaches to feature extraction are geometry feature-based methods and template-based methods [1-6]. In the latter approach, which is the more common one, the entire face image is statistically analysed to obtain a set of feature vectors that best describe a given face image. A typical face image is represented by a high-dimensional array (e.g. 120x120). To avoid computational complexity and to reduce redundant data, face images are first linearly transformed into a low-dimensional subspace and then a feature vector is extracted. Typical dimension reduction methods are based on PCA [1-3, 5], LDA (Linear Discriminant Analysis) [4, 6, 7], ICA (Independent Component Analysis) [1] or a combination of these methods [6, 8].

The performance of a face recognition/verification system is affected by variations in facial images due to illumination, pose, occlusion, facial expression and scale [1, 3, 5]. A good scheme should be robust under these variations. To accommodate such flexibility, a number of sample images for each individual, encapsulating most of these variations, are required for the training/enrolment step [1-3]. Adding more images to the training set increases computational costs and requires extra storage for feature vectors. Moreover, the accuracy of biometrics-based verification over the Internet can be undermined by the high-ratio compression necessitated by bandwidth and network performance [9, 10].

Here we propose a wavelet-based pre-processing step that reduces the amount of training data, followed by the application of PCA to the low-subband image coefficients to extract feature vectors for the training set of face images. Reducing image dimensions using wavelet decomposition and then applying PCA (or other feature extraction methods) reduces both computation (smaller covariance matrices, smaller images) and storage requirements. Recent dimension reduction approaches to face recognition include downsizing images by averaging, ranklets and waveletfaces (e.g. [2, 5, 6, 11]). We have experimented with downsizing by rescaling and found that, though such a scheme performs reasonably well, it is outperformed by the wavelet-only scheme. The advantage of wavelet decomposition over image downsizing (e.g. by averaging non-overlapping blocks) is that wavelet decomposition retains all the information of the original image, which can be used for de-noising, face location and compression [12]. Indeed, a face location scheme that uses the non-LL subbands with a very high accuracy rate has been developed within our group. The significance of wavelet-based schemes for constrained devices stems from the fact that the size of the k-th low subband is 1/4^k of the original image size. Moreover, working at wavelet decomposition levels beyond 3 is expected to result in robustness against high-rate compression and noise interference.

2. PCA (Eigenfaces) for Face Recognition

Principal component analysis (PCA), also known as the Karhunen-Loeve transform, aims to find the principal components (eigenvectors, or eigenfaces) of a given set of facial images and to represent each face image as a point in a lower-dimensional face-space using the eigenfaces that correspond to a few of the largest eigenvalues [3]. The new coordinates of the original faces in the eigenface-space form their feature vectors. Let Γ = {T1, T2, ..., TM} be the training set of M images, each of size N x N. Their covariance matrix is calculated as:

C = (1/M) Σ_{i=1}^{M} Φi Φi^T

where Φi = Ti − A, and A is the average image of the training set Γ. The principal components/eigenfaces of the set Γ of images are the eigenvectors of the covariance matrix C. Determining the eigenvectors and eigenvalues of the N² x N² matrix C is an intractable task when N is large. Turk and Pentland [3] describe a solution to this problem: first solve for the eigenvalues and their corresponding eigenvectors of an M x M matrix, where M is the number of face images in the training set (useful when M < N²), and then take appropriate linear combinations of the centred face images Φi to obtain the M eigenfaces. This M x M matrix is symmetric and has M real, not necessarily distinct, eigenvalues. For recognition, it is sufficient to use a smaller number M′ (M′ < M) of eigenfaces, namely those corresponding to the M′ most significant eigenvalues. These M′ eigenfaces span a lower-dimensional eigenface-space, and only a small number of eigenfaces is needed to obtain a very good approximation of the feature vectors of the training images (see Figure 1). The projection coordinates of the training face images in this lower-dimensional eigenface-space are the feature vectors of the training images. The feature vector of a training face Ti is a linear combination of the chosen eigenfaces, whose coefficients are simply the inner products of Φi = Ti − A with the corresponding eigenfaces. Earlier PCA schemes used one feature vector to represent an enrolled subject, namely the centroid of the projections of its training images in the eigenface-space, but more recent schemes represent a subject by the set of feature vectors obtained from each of its training images. This is expected to result in improved accuracy [2], and our experiments confirm this.
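To make the computation concrete, the following is a minimal NumPy sketch of this training step using the Turk-Pentland M x M trick; the function and variable names are our own illustrative choices, not part of any published implementation:

```python
import numpy as np

def train_eigenfaces(images, n_components):
    """Eigenface training via the Turk-Pentland M x M trick.

    images: array of shape (M, N*N), one flattened training face per row.
    Returns the average face A, the chosen eigenfaces and the feature
    vectors of the training images."""
    M = images.shape[0]
    A = images.mean(axis=0)                  # average image of the set
    Phi = images - A                         # centred faces, shape (M, N*N)
    # Eigen-decompose the small M x M matrix (Phi Phi^T)/M instead of
    # the intractable N^2 x N^2 covariance matrix C = (Phi^T Phi)/M.
    small = Phi @ Phi.T / M
    eigvals, eigvecs = np.linalg.eigh(small)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # M' most significant
    # Linear combinations of the centred faces give the eigenfaces.
    eigenfaces = Phi.T @ eigvecs[:, order]            # shape (N*N, M')
    eigenfaces /= np.linalg.norm(eigenfaces, axis=0)  # unit-norm columns
    # Feature vectors: inner products of each Phi_i with the eigenfaces.
    features = Phi @ eigenfaces                       # shape (M, M')
    return A, eigenfaces, features
```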

When a new face image T is presented, its projection vector in the eigenface-space is calculated, together with its distance from each of the feature vectors in the training set. T is matched to the subject from the training set whose feature vector is closest, within an acceptable threshold. Similarity measures such as the Euclidean, cosine or Mahalanobis distance functions are among the most commonly used criteria. For face verification, the thresholds are subject-dependent and can be determined during the testing stage to maximise the accuracy rate.
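A corresponding verification sketch, assuming the Euclidean distance as the similarity measure (any of the measures named above could be substituted) and a per-subject threshold tuned during testing:

```python
def verify(probe, A, eigenfaces, subject_features, threshold):
    """Accept the claimed identity if the probe's projection lies within
    `threshold` (Euclidean distance) of the closest of the subject's
    stored feature vectors; `probe` is a flattened face image."""
    w = (probe - A) @ eigenfaces         # project into eigenface-space
    distances = np.linalg.norm(subject_features - w, axis=1)
    return bool(distances.min() <= threshold)
```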

Many argue that PCA does not capture the similarities between small variations of the same class (variants of the same face should be projected near each other in the face-space) as well as it captures the differences between different classes (face images of different individuals should be projected far from each other) [6]. An alternative to PCA is LDA/Fisherfaces, but in practice, when the training set is large, PCA or another dimension reduction scheme has to be used to reduce the dimension of the face-space before applying LDA to extract features, e.g. [1, 6].

3. Wavelet Transforms

The wavelet transform is a technique for analysing finite-energy signals at multiple resolutions. In contrast to the traditional short-time Fourier transform, it provides an alternative tool for the short-time analysis of quasi-stationary signals, such as speech and image signals. The Discrete Wavelet Transform (DWT) is a special case of the wavelet transform that provides a compact representation of a signal in time and frequency and can be computed very efficiently. The DWT is used to decompose a signal into frequency subbands at different scales, from which the signal can be perfectly reconstructed. Mathematically, the DWT is equivalent to filtering the input image with a bank of band-pass filters whose impulse responses are approximated by different scales of the same mother wavelet. It decomposes a signal by successive highpass and lowpass filtering of the time-domain signal, each stage followed by sub-sampling by two. Consequently, a wavelet-transformed image is decomposed into a set of subbands of different resolutions, each represented by a different frequency band.

There are a number of different ways of applying a 2D wavelet transform. The usual and most common wavelet decomposition of an image is the pyramid scheme, which we use throughout this paper. At a resolution depth of k, the pyramidal scheme decomposes an image I into 3k + 1 subbands, {LLk, LHk, HLk, HHk, LHk-1, HLk-1, HHk-1, ..., LH1, HL1, HH1}, with LLk being the lowest-pass subband. The subbands LH1, HL1 and HH1 contain the finest-scale wavelet coefficients, and the coefficients get coarser towards LLk, which is the coarsest. Although the transformed version of I is not an image, it is customary to use a scaling function to produce a greyscale image representation of it, in which the low subband LLk looks like a smoothed version of the original image (see Figure 2). Therefore, LLk is considered the k-level resolution approximation of the image I.
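As an illustration, the pyramidal decomposition and extraction of LLk can be written in a few lines using the PyWavelets library (our choice here; any DWT implementation would serve):

```python
import numpy as np
import pywt  # PyWavelets

def ll_subband(image, level):
    """Pyramidal 2-D DWT down to `level`, returning the low-pass
    approximation LL_k. With 'periodization' mode, an H x W image
    yields an (H/2^k) x (W/2^k) subband, i.e. 1/4^k of the original
    number of coefficients."""
    coeffs = pywt.wavedec2(image, wavelet='haar',
                           mode='periodization', level=level)
    return coeffs[0]          # coeffs = [LL_k, details_k, ..., details_1]

# A 160x192 face image (the size used in Section 5) gives a 5x6 LL_5.
face = np.random.rand(160, 192)   # stand-in for a real face image
print(ll_subband(face, 5).shape)  # -> (5, 6), i.e. 30 coefficients
```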

Recently, wavelet transforms have been used for face recognition, mostly combined with LDA schemes, e.g. [6, 13, 14]. In this work, which is part of the second author's Ph.D. project, PCA is applied to the LLk subband of facial images for k = 1, 2, ..., 5, and the performance of these systems is compared with that of the original PCA system (i.e. applied to the original images). We will also demonstrate that using the LL subband itself as the face feature vector results in a comparable or even higher accuracy rate.
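For the wavelet-only scheme, a short sketch of the matching rule (the Euclidean distance and nearest-sample rule are our assumptions, carried over from the verification procedure of Section 2):

```python
def wavelet_only_verify(probe_image, gallery_subbands, level, threshold):
    """Wavelet-only scheme: the LL_k coefficients themselves serve as
    the feature vector, so no training is needed. Accept if the probe's
    LL_k is close enough to any enrolled sample of the claimed subject.
    Reuses ll_subband from the sketch above."""
    w = ll_subband(probe_image, level).ravel()
    distances = [np.linalg.norm(w - g.ravel()) for g in gallery_subbands]
    return min(distances) <= threshold
```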

4. PCA in the Wavelet Domain

Given the training set Γ above, applying a wavelet transform to each of the training images results in a set Wk(Γ) of multi-resolution decomposed images. Let Lk(Γ) be the set of all k-level low subbands obtained from the elements of Wk(Γ). Instead of applying PCA to the original training set, our approach is to apply PCA to the set Lk(Γ), whose elements are the training vectors in the wavelet domain (i.e. the LLk subbands). It is worth noting that each wavelet coefficient in the LLk subband is a function of a 2^k x 2^k block of pixels in the original image, representing the total energy in the block as scaled by the given mother wavelet.
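Combining the two earlier sketches, PCA in the wavelet domain amounts to a one-line pre-processing change (again an illustrative sketch under the same assumptions, not the SecurePhone implementation):

```python
def train_wavelet_pca(images, level, n_components):
    """PCA in the wavelet domain: reduce each training face to its LL_k
    subband first, then run eigenface training on the much smaller
    subband vectors. Reuses ll_subband and train_eigenfaces from the
    earlier sketches; `images` is a list of 2-D face arrays."""
    lls = np.stack([ll_subband(img, level).ravel() for img in images])
    return train_eigenfaces(lls, n_components)
```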

There are many different wavelet filter banks that could be used in the transformation stage, and it is possible that the choice of filter influences the accuracy rate of PCA in the wavelet domain. However, in this paper we only report on the use of the Haar filter. The Haar filter is an orthogonal filter of length 2, and hence the LLk coefficients represent non-overlapping 2^k x 2^k blocks in the original image.
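This block structure can be checked directly: with the orthonormal Haar filter, each LLk coefficient is 2^k times the mean of its 2^k x 2^k pixel block. A small sketch (assuming image dimensions divisible by 2^k and the periodized transform from the earlier sketch):

```python
def haar_ll_by_block_means(image, level):
    """For the orthonormal Haar filter, each LL_k coefficient equals
    2^k times the mean of the corresponding non-overlapping
    2^k x 2^k pixel block."""
    b = 2 ** level
    h, w = image.shape                    # must be divisible by b
    blocks = image.reshape(h // b, b, w // b, b)
    return b * blocks.mean(axis=(1, 3))

face = np.random.rand(160, 192)
assert np.allclose(haar_ll_by_block_means(face, 5), ll_subband(face, 5))
```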

5. Experimental Results & Discussion

To test the performance of the various schemes, we conducted experiments on a number of different face image databases. In this paper, we only report the results of face verification on an in-house recorded collection of facial video images. The larger set of experiments will be presented in the near future, with more extensive analysis and discussion.

The facial images in the collection were acquired from a number of video clips belonging to 20 subjects (on average 12 videos per subject). The videos were taken so as to capture facial expressions and up to 3 different facial area sizes; small changes in pose and illumination were also incorporated. All videos of a subject were taken on the same day. Facial images of size 160x192 were extracted manually from these videos. Figure 3 shows a small sample of the facial images used in our experiments. In what follows, we use slightly modified concepts for the error rates: the FAR (FRR) is the ratio of falsely accepted (rejected) faces out of all tested face images.
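A sketch of these error-rate definitions in code (we assume match scores are distances, so a smaller score means a better match):

```python
import numpy as np

def far_frr(genuine_scores, impostor_scores, threshold):
    """FRR: fraction of genuine probe faces falsely rejected.
    FAR: fraction of impostor probe faces falsely accepted.
    Scores are match distances; a probe is accepted when its
    score is at most `threshold`."""
    genuine = np.asarray(genuine_scores)
    impostor = np.asarray(impostor_scores)
    frr = float(np.mean(genuine > threshold))     # falsely rejected
    far = float(np.mean(impostor <= threshold))   # falsely accepted
    return far, frr
```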

Enrolment/training module. Training is based on 10 facial image frames from 5-6 different videos (at most 2 frames per video) of the training video set. We used between 5 and 10 eigenfaces to calculate the projection weights.

The testing module. For each of the trained subjects, we chose 165 frames from their recorded videos to test for true acceptance and false rejection. These included frames from the training videos (other than those used for training) as well as frames from their other video recordings. We also tested for true rejection and false acceptance using 19 impostors, each contributing 10 frames from different videos.

These experiments were repeated for 8 subjects, mainly first-year university students (3 males and 5 females). The average accuracy rates for the various schemes are presented in Chart 1, which also includes the accuracy rates for the wavelet-only feature vectors at levels 4 and 5. Except for the wavelet-only features, these results are based on using different numbers of eigenfaces for matching, as indicated. All these accuracy rates correspond to a False Acceptance Rate (FAR) of <1%. The results indicate that the accuracy of PCA is greatly improved in the wavelet domain for any number of eigenfaces, with the best rates achieved at level 5. Moreover, the use of wavelet-only feature vectors yields significant accuracy rates, namely 96% at the LL subbands of levels 4 and 5.

The high accuracy rates achieved in the LL5 subband, for all the wavelet-based schemes, have significant implications for constrained platforms and devices, because the size of LL5 is only 1/1024 (i.e. 1/4^5) of the size of the original face image. This yields significant savings in verification time and in the storage needed for the feature vectors. It also means that, for more accuracy, we can well afford to use more eigenfaces in the wavelet domain than in the spatial domain.

Interestingly, variations in facial expression and/or the use of glasses did not contribute to false rejections in LL5, but variations in scale, pose and illumination did. During training, one subject was wearing glasses but some of their testing frames were without glasses, and another subject had no glasses in training but wore glasses in some of their test frames.

Since the LLk coefficients represent the scaled energy in non-overlapping 2^k x 2^k blocks of the original image, the larger k is, the smaller the variation that can be detected between corresponding blocks in different images of the same person. Consequently, the improvement of wavelet-based PCA for identification may be due to the fact that variation within the same class (variants of face images of the same person) is reduced by the wavelet transform.

6. Conclusion

We have presented wavelet-based face verification/identification techniques that either apply PCA in the wavelet domain rather than the spatial domain, or use the low subbands as face feature vectors without any further processing. Together with the results of a more extensive set of experiments to be reported later, the results demonstrate that using the wavelet transform as a pre-processing dimension reduction step prior to applying PCA yields a significant improvement in both accuracy and efficiency over the spatial PCA scheme for verification. The size of the LLk subband is 1/4^k of the original size (see Figure 4).

The results point to the superiority of wavelet-only feature vectors over the other examined face verification schemes. Besides accuracy and efficiency, the wavelet-only scheme requires no training, and hence the system can be allowed to evolve over time: whenever necessary, new image samples can be added to the stored set.

Future work includes investigating the clustering of feature vectors for individuals in the eigenface-space, developing a strategy for the selection and number of training images, and studying the effect of the choice of wavelet filter. We will also work to incorporate the wavelet-based face detection scheme developed within the group, which uses the non-LL subbands.

References

    1. W. Zhao, R. Chellappa, A. Rosenfeld, and P. J. Phillips. "Face Recognition: A Literature Survey," Technical report, Computer Vision Lab, University of Maryland, 2000.

    2. T. Sim, R. Sukthankar, M. D. Mullin, and S. Baluja. "High-Performance Memory-based Face Recognition for Visitor Identification," ICCV-99, Paper No. 374.

    3. M. Turk, and A. Pentland. "Eigenfaces for Recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

    4. H. Yu, and J. Yang. "A Direct LDA Algorithm for High-Dimensional Data - with Application to Face Recognition," Pattern Recognition, vol. 34, pp. 2067-2070, Sept. 2000.

    5. R. Gottumukkala, and V. K. Asari. "A Robust Face Authentication Technique Based on Composite PCA Method," Proc. Int'l Conference on Imaging Science, Systems and Technology (CISST), pp. 201-213, June 2003.

    6. Jen-Tzung Chien, and Chia-Chen Wu. "Discriminant Waveletfaces and Nearest Feature Classifiers for Face Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 12, pp. 1644-1649, December 2002.

    7. Ming-Hsuan Yang. "Kernel Eigenfaces vs. Kernel Fisherfaces: Face Recognition Using Kernel Methods," Proc. IEEE Int'l Conf. Automatic Face and Gesture Recognition, pp. 215-220, May, 2002.

    8. R. Gross, J. Shi, and J. Cohn. "Quo vadis Face Recognition?," 3rd Workshop on Empirical Evaluation Methods in Computer Vision, December 2001.

    9. J. Siau, and A. M. Ariyaeeinia. "Data Transmission in Biometrics over the Internet," Proc. COST 275 Workshop: The Advent of Biometrics over the Internet, Italy, pp. 51-54, 2002.

    10. N. W. D. Evans, J. S. Mason, R. Auckenthaler, and R. Stapert. "Assessment of Speaker Verification Degradation Due to Packet Loss in the Context of Wireless Mobile Devices," Proc. COST 275 Workshop: The Advent of Biometrics over the Internet, Italy, pp. 47-50, 2002.

    11. F. Smeraldi. "A Nonparametric Approach to Face Detection Using Ranklets," Proc. AVBPA Int'l Conf. Audio- and Video-Based Biometric Person Authentication, pp. 351-359, June 2003.

    12. D. Xi, and Seong-Whan Lee. "Face Detection and Facial Component Extraction by Wavelet Decomposition and Support Vector Machines," Proc. AVBPA Int'l Conf. Audio- and Video-Based Biometric Person Authentication, pp. 199-207, June 2003.

    13. A. Z. Kouzani, F. He, and K. Sammut. "Wavelet Packet Face Representation and Recognition," Proc. IEEE Conf. Systems, Man, and Cybernetics, pp. 1614-1619, 1997.

    14. Dao-Qing Dai, and P. C. Yuen. "Wavelet-Based 2-Parameter Regularized Discriminant Analysis for Face Recognition," Proc. AVBPA Int'l Conf. Audio- and Video-Based Biometric Person Authentication, pp. 137-144, June 2003.

Charts and Images

Chart 1. Average individual verification accuracy rates

Figure 1. Recreating a face image using its projections onto various numbers of significant eigenfaces

Figure 2. (a) Original image I, (b) 1-stage wavelet transform of I, (c) 2-stage wavelet transform of I

Figure 3. A sample of images in the in-house collection.

Figure 4. (a) Eigenfaces in the spatial domain, (b) Eigenfaces in LL1, (c) Eigenfaces in LL2
