PRCP: Probabilistic Robust Conformal Prediction
Subhankar Ghosh, Yuanjie Shi, Taha Belkhouja, Yan Yan, Jana Doppa, Brian Jones
Abstract
Conformal prediction (CP) is a framework for quantifying the uncertainty of machine learning classifiers, including deep neural networks. Given a test example and a trained classifier, CP produces a prediction set of candidate labels with a user-specified coverage (i.e., the true class label is contained in the set with high probability). Almost all existing work on CP assumes clean test data, and little is known about the robustness of CP algorithms w.r.t. natural or adversarial perturbations of test examples. This paper studies the problem of probabilistically robust conformal prediction (PRCP), which ensures robustness to most perturbations around clean input examples. PRCP generalizes standard CP (which cannot handle perturbations) and adversarially robust CP (which ensures robustness w.r.t. worst-case perturbations) to achieve better trade-offs between nominal performance and robustness. We propose a novel adaptive PRCP (aPRCP) algorithm that determines an appropriate threshold during the calibration step to achieve probabilistically robust coverage. The key idea behind aPRCP is to determine two parallel thresholds, one for data samples and another for the perturbations on data (a quantile-of-quantile design). We provide theoretical analysis showing that the aPRCP algorithm achieves robust coverage. Our experiments on the CIFAR-10, CIFAR-100, and ImageNet datasets using deep neural networks demonstrate that aPRCP achieves better trade-offs than state-of-the-art CP and adversarially robust CP algorithms.
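The two-threshold idea can be illustrated with a small calibration sketch. This is a simplified reading of the abstract, not the authors' implementation: the function names, the array layout, and the use of the nominal levels alpha and alpha_tilde without any adjustment are all assumptions made for illustration.

```python
import numpy as np

def conformal_quantile(values, level):
    """Finite-sample-corrected empirical quantile, as used in split conformal prediction."""
    values = np.sort(np.asarray(values, dtype=float))
    n = values.size
    k = int(np.ceil((n + 1) * level))  # rank of the order statistic to return
    if k > n:
        return np.inf                  # too few calibration points for this level
    return values[k - 1]

def quantile_of_quantile_threshold(perturbed_scores, alpha, alpha_tilde):
    """Hypothetical helper: perturbed_scores[i, j] is the nonconformity score of
    calibration example i (at its true label) under its j-th sampled perturbation."""
    s = np.sort(np.asarray(perturbed_scores, dtype=float), axis=1)
    m = s.shape[1]
    # Inner threshold: per example, the (1 - alpha_tilde) quantile over perturbations.
    k_inner = min(max(int(np.ceil(m * (1.0 - alpha_tilde))), 1), m)
    inner = s[:, k_inner - 1]
    # Outer threshold: conformal (1 - alpha) quantile of the inner values over examples.
    return conformal_quantile(inner, 1.0 - alpha)
```

The returned value would then serve as the single threshold applied to test-time scores; how the two levels should be set or adjusted to obtain a formal guarantee is left to the paper's analysis.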
The figure contrasts the three methods. The clean data point (red, at the center) is surrounded by an L2-norm ball of perturbations.
Vanilla CP: This method provides marginal coverage (not conditioned on either x or y) when there is no distribution shift between training and test data. As the figure shows, it guarantees coverage only for the clean data point at the center.
RSCP: This method gives coverage guarantees for all examples inside the ball (drawn as a circle in the figure).
PRCP: This method provides coverage guarantees for samples that come from a (1 - a) fraction of the ball. If a = 0, PRCP reduces to RSCP.
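The three guarantees can be written side by side. The notation is assumed for illustration and is not taken from the source: C is the prediction set, r the radius of the L2 ball, 1 - α the nominal coverage level, P_δ a distribution over perturbations inside the ball, and a the fraction used in the PRCP description.

```latex
\begin{align*}
\text{Vanilla CP:} \quad
  & \mathbb{P}_{(X,Y)}\big[\, Y \in C(X) \,\big] \ge 1 - \alpha \\
\text{RSCP:} \quad
  & \mathbb{P}_{(X,Y)}\big[\, Y \in C(X + \delta)\ \text{for all}\ \|\delta\|_2 \le r \,\big] \ge 1 - \alpha \\
\text{PRCP:} \quad
  & \mathbb{P}_{(X,Y)}\Big[\, \mathbb{P}_{\delta \sim P_\delta}\big[\, Y \in C(X + \delta) \,\big] \ge 1 - a \,\Big] \ge 1 - \alpha
\end{align*}
```

Under this reading, setting a = 0 requires the inner event to hold for (almost) every perturbation in the ball, which matches the worst-case guarantee in the second line.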
Advantages compared to existing methods:
Our method (PRCP) is more efficient (it produces smaller prediction sets) than RSCP under both probabilistic robust coverage and adversarial robust coverage.
At test time, RSCP needs to sample m noisy points for each test example, whereas PRCP does not. PRCP is therefore computationally cheaper at test time.
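The test-time cost argument can be made concrete with a minimal sketch of set construction from a single model evaluation. The callable `model`, the function name, and the nonconformity score 1 - p(y | x) are hypothetical choices for illustration; the source does not specify them.

```python
import numpy as np

def predict_set_single_pass(model, x, tau):
    """Build a prediction set from one evaluation of `model` on `x`.

    `model` is a hypothetical callable returning class probabilities;
    `tau` is a threshold obtained earlier, during calibration.
    """
    probs = np.asarray(model(x), dtype=float)  # single forward pass, no noisy copies of x
    scores = 1.0 - probs                       # assumed nonconformity score: 1 - p(y | x)
    return np.flatnonzero(scores <= tau).tolist()
```

A smoothing-based method would instead evaluate the model on m noisy copies of x before forming the set, so its per-example cost grows with m.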
Disadvantages compared to existing methods:
PRCP requires hyperparameter tuning to achieve better efficiency, whereas RSCP does not. With everything else held the same for both methods, PRCP therefore needs additional data points for tuning.
RESULTS
ImageNet
CIFAR-100
CIFAR-10