face databases. A sample of this database is shown in Fig. 2. In most real-world applications, a face identification system is first trained on a training database and the trained system is then used to perform recognition on a test database. In such applications, it is highly likely that there is no overlap between the subjects in the training database and the subjects in the test database. To evaluate the performance of face recognition algorithms in such a scenario, the database is partitioned into two groups: training and testing. Face images pertaining to 360 subjects (40% of the database) are used to train the face recognition algorithms, and the remaining images pertaining to 540 subjects (60% of the database) are used as the test database for performance evaluation. The non-overlapping train-test partitioning is repeated 10 times and recognition performance is computed in terms of identification accuracy. Cumulative Matching Curves (CMC) are generated by computing the identification accuracy over these trials for the top 10 ranks. The CMC in Fig. 3(a) and the rank-1 identification accuracies reported in Table II show that face recognition algorithms such as SURF, CLBP and GNN provide good accuracy (73-84%).
2) Performance on Plastic Surgery Database: With the same experimental protocol (as described for Experiment 1), we partition the plastic surgery database into non-overlapping training (360 subjects) and testing (540 subjects) datasets and compute the identification accuracy of the face recognition algorithms. Fig. 3 shows the CMC curve and Table II reports the rank-1 identification accuracies of this experiment. On the plastic surgery database, the best rank-1 identification accuracy is 54%, which is about 30% lower than when the same algorithm is evaluated on the non-surgery database.
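The subject-disjoint protocol used in Experiments 1 and 2 can be illustrated with a minimal sketch. It assumes that each algorithm produces a probe-by-gallery similarity matrix in which higher scores mean better matches; the matcher itself is not specified here, and the helper names `repeated_splits` and `cmc` are hypothetical, introduced only for illustration.

```python
import numpy as np


def repeated_splits(subject_ids, n_train, n_trials, seed=0):
    """Yield (train_subjects, test_subjects) pairs with no subject overlap.

    Hypothetical helper: each trial is an independent random partition
    of the unique subject IDs.
    """
    rng = np.random.default_rng(seed)
    unique_subjects = np.unique(subject_ids)
    for _ in range(n_trials):
        shuffled = rng.permutation(unique_subjects)
        yield shuffled[:n_train], shuffled[n_train:]


def cmc(scores, probe_labels, gallery_labels, max_rank=10):
    """Identification accuracy at ranks 1..max_rank.

    scores[i, j] is the similarity between probe i and gallery entry j
    (assumption: higher means a better match).
    """
    order = np.argsort(-scores, axis=1)  # best match first
    hits = gallery_labels[order] == probe_labels[:, None]
    # A probe counts as identified at rank k if its true subject
    # appears among the k best-scoring gallery entries.
    return np.array(
        [hits[:, :k].any(axis=1).mean() for k in range(1, max_rank + 1)]
    )


if __name__ == "__main__":
    # 900 subjects split 360/540, repeated 10 times, as in the protocol.
    subject_ids = np.arange(900)
    for train, test in repeated_splits(subject_ids, n_train=360, n_trials=10):
        assert np.intersect1d(train, test).size == 0
```

In such a setup, the per-trial curves returned by `cmc` would be averaged over the 10 partitions to obtain the reported CMC, and the first element of the averaged curve is the rank-1 identification accuracy.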
3) Performance with Training on Non-Surgery Database and Testing on Plastic Surgery Database: In general, face recognition algorithms are unlikely to be trained using pre- and post-surgery images. Therefore, in the third experiment, we use 360 subjects from the non-surgery database for training the algorithms (i.e., the training data from Experiment 1) and 540 subjects from the plastic surgery database for testing (i.e., the testing data from Experiment 2). The results of this experiment are documented in Table III. This table also shows a comprehensive breakdown of results according to the type of surgery performed. The key observations and analysis of the three experiments are summarized below:
• Fig. 3 and Table II show the actual decrease in the identification performance of face recognition algorithms due to plastic surgery. For example, PCA yields 59.3% rank-1 identification accuracy when using the non-surgery database (face images with neutral expression, proper illumination and no occlusion). On the other hand, the accuracy decreases by 30% when evaluated with pre- and post-surgery face images. Similarly, the performance of the other face recognition algorithms decreases by 26-30%. This comparison underlines that plastic surgery is a very challenging problem and that algorithms need to be developed to counter these effects.
• As shown in Table III, face recognition algorithms cannot handle global facial plastic surgery such as skin resurfacing and full face lift. With 10 times cross validation, the performance of the recognition algorithms varies in the range of 18-54%, which is not acceptable in real-world applications. In most of the test cases for global surgery, the differences between pre- and post-surgery images of the same individual are very large. In other words, facial features and texture are drastically altered after surgery and hence the algorithms do not yield good performance.
For some test cases of skin resurfacing in which the pre- and post-surgery images resemble each other relatively closely, most of the recognition algorithms are able to perform correct classification. However, with major skin resurfacing, such as surgeries to look younger, none of the algorithms is able to correctly classify the faces. Dermabrasion is another important and common surgical procedure that affects face recognition performance.
• Among the different types of plastic surgery, otoplasty (i.e., ear surgery) has the lowest effect on the performance of face recognition. On the other hand, local facial regions such as the nose, chin, eyelids, cheeks, lips and forehead play an important role in face recognition. Any change in one of these regions, in general, affects the identification accuracy. For example, in LFA, the nose and eyes play an important role and most of the local features are found close to these regions. Any change in these regions degrades the identification performance.
• Overall, with variations in both global and local surgeries, rank-1 identification accuracies are in the range of 18% (PCA) to 61% (GNN). It should be noted that these results are computed on frontal images with neutral expression and proper illumination. If other covariates such as pose, expression and illumination are included, the performance may deteriorate further.
• The results of Experiments 2 and 3 show that the performance of face recognition algorithms is slightly better when they are trained on pre- and post-surgery images compared to