Hyejin Park and Dongbo Min
Ewha Womans University, Seoul, South Korea
clrara@ewha.ac.kr, dbmin@ewha.ac.kr
Abstract
In the realm of Adversarial Distillation (AD), strategic and precise knowledge transfer from an adversarially robust teacher model to a less robust student model is paramount. Our Dynamic Guidance Adversarial Distillation (DGAD) framework directly tackles the challenge of differential sample importance, with a keen focus on rectifying the teacher model's misclassifications. DGAD employs Misclassification-Aware Partitioning (MAP) to dynamically tailor the distillation focus, optimizing the learning process by steering it toward the most reliable teacher predictions. Additionally, our Error-corrective Label Swapping (ELS) corrects the teacher's misclassifications on both clean and adversarially perturbed inputs, refining the quality of knowledge transfer. Further, Predictive Consistency Regularization (PCR) enforces consistent performance of the student model across both clean and adversarial inputs, significantly enhancing its overall robustness. By integrating these methodologies, DGAD significantly improves accuracy on clean data and fortifies the model's defenses against sophisticated adversarial threats. Our experimental validation on the CIFAR10 and CIFAR100 datasets, employing various model architectures, demonstrates the efficacy of DGAD, establishing it as a promising approach for enhancing both the robustness and accuracy of student models in adversarial settings.
Dynamic Guidance Adversarial Distillation (DGAD)
In this study, we introduce the Dynamic Guidance Adversarial Distillation (DGAD) framework (see Figure 1), embodying the principle of 'dynamic guidance'. This concept transcends the traditional static approach to weighting distillation processes for clean and adversarial inputs by employing dynamic weighting to optimize the distillation focus. Dynamic guidance entails the real-time recognition and segregation of training inputs within a batch, based on the teacher model's misclassification status of clean inputs. It is followed by the immediate correction of any misclassified labels for both segregated clean and adversarial inputs during distillation. By pinpointing and separating misclassified samples, DGAD enables a customized distillation strategy that optimally addresses both standard and adversarial training needs. We employ three key interventions within this framework to ensure the precise and effective transfer of knowledge to the student model, thereby enhancing its robustness and accuracy.
Figure 1. The overview of Dynamic Guidance Adversarial Distillation (DGAD) framework. The DGAD framework optimizes adversarial distillation by categorizing inputs via Misclassification-Aware Partitioning (MAP), applying Error-corrective Label Swapping (ELS) for teacher's mispredictions, and enhancing learning uniformity with Predictive Consistency Regularization (PCR).
(1) Misclassification-Aware Partitioning (MAP): To realize dynamic weighting, this strategy partitions the training inputs into two subsets based on the teacher model's predictions on clean inputs. The Standard Training (ST) subset comprises clean inputs the teacher misclassifies, emphasizing the correction of these misclassifications during standard training. Conversely, the Adversarial Training (AT) subset contains clean inputs the teacher classifies correctly; adversarially perturbed versions of these inputs are used to boost the student model's resistance to adversarial attacks.
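A minimal PyTorch-style sketch of how MAP could partition a training batch is given below; here teacher, x, and y stand for the robust teacher model, a clean input batch, and its ground-truth labels, and the function name is illustrative rather than taken from the released code.

import torch

def map_partition(teacher, x, y):
    # Teacher predictions on the clean batch; no gradients are needed for partitioning.
    with torch.no_grad():
        teacher_pred = teacher(x).argmax(dim=1)
    correct = teacher_pred.eq(y)
    # AT subset: clean inputs the teacher classifies correctly (later adversarially perturbed).
    x_at, y_at = x[correct], y[correct]
    # ST subset: clean inputs the teacher misclassifies (kept clean for standard distillation).
    x_st, y_st = x[~correct], y[~correct]
    return (x_st, y_st), (x_at, y_at)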
(2) Error-corrective Label Swapping (ELS): Building upon the MAP, ELS is applied to inputs where the teacher model's predictions remain incorrect, specifically including the ST subset and adversarial examples generated from the AT subset. By swapping the misidentified labels for the correct ones, ELS ensures the student model learns from accurate labels, directly addressing and amending the teacher's prediction errors observed during distillation.
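As an illustration, one plausible realization of ELS swaps, in the teacher's soft output, the probability assigned to the wrong top-1 class with the probability assigned to the ground-truth class; the PyTorch-style helper below is a hedged sketch under that assumption, not the exact released implementation.

import torch
import torch.nn.functional as F

def els_swap(teacher_logits, y):
    # Soft teacher targets; the swap only affects inputs the teacher misclassifies.
    probs = F.softmax(teacher_logits, dim=1)
    pred = probs.argmax(dim=1)
    wrong = pred.ne(y)
    idx = torch.arange(probs.size(0), device=probs.device)[wrong]
    corrected = probs.clone()
    # Exchange the probability mass of the misidentified class and the true class.
    corrected[idx, y[wrong]] = probs[idx, pred[wrong]]
    corrected[idx, pred[wrong]] = probs[idx, y[wrong]]
    return corrected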
(3) Predictive Consistency Regularization (PCR): PCR addresses the imbalance between Standard Training (ST) and Adversarial Training (AT) that arises from learning exclusively on the distinct subsets created during MAP. By promoting consistent predictions across the entirety of the data, PCR ensures the student model does not develop biases toward either subset, maintaining balanced and effective learning across all inputs.
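One natural way to express PCR is a symmetric KL consistency term between the student's predictions on clean inputs and on their adversarial counterparts; the form below is an assumption for illustration, and the exact loss and weighting used in the paper may differ.

import torch.nn.functional as F

def pcr_loss(student_logits_clean, student_logits_adv):
    # Encourage the student to predict consistently on clean and adversarial views of the same input.
    p_clean = F.softmax(student_logits_clean, dim=1)
    p_adv = F.softmax(student_logits_adv, dim=1)
    kl_adv_to_clean = F.kl_div(F.log_softmax(student_logits_adv, dim=1), p_clean, reduction='batchmean')
    kl_clean_to_adv = F.kl_div(F.log_softmax(student_logits_clean, dim=1), p_adv, reduction='batchmean')
    return 0.5 * (kl_adv_to_clean + kl_clean_to_adv)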
Implementation Details:
We train student models using ResNet18 and MobileNetV2, with WideResNet34-10 and WideResNet34-20 as teacher models on CIFAR10 and CIFAR100, respectively. All student models are trained for 200 epochs, with the learning rate adjusted at epochs 100 and 150; the initial learning rate is 0.1 for ResNet18 and 0.05 for MobileNetV2. For the Tiny ImageNet dataset, we use a ResNet18 student model with PreActResNet18 as the teacher model and set the hyperparameter β to 15.
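For reference, the piecewise learning-rate schedule described above can be sketched as follows; a 10x decay at each milestone is assumed here, since the exact decay factor is not stated in this section.

def piecewise_lr(epoch, lr_max):
    # lr_max is 0.1 for ResNet18 and 0.05 for MobileNetV2; milestones at epochs 100 and 150.
    if epoch < 100:
        return lr_max
    if epoch < 150:
        return lr_max * 0.1   # assumed 10x decay
    return lr_max * 0.01      # assumed second 10x decay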
Code & Checkpoints: download
Requirements:
AutoAttack
pip install git+https://github.com/fra31/auto-attack
advertorch
pip install advertorch
Training Commands:
To train a ResNet-18 model on CIFAR10 (or CIFAR100) with the proposed DGAD using the WideResNet-34-10 teacher model, run:
python main.py \
--dataset CIFAR10 \
--model resnet18 \
--method DGAD \
--teacher_model wideresnet34_10 \
--epsilon 8 \
--num_steps 10 \
--step_size 2 \
--epochs 200 \
--bs 128 \
--lr_max 0.1 \
--lr_schedule piecewise \
--ALP_beta 5.0 \
--gpu_id 0
To train a ResNet-18 model on Tiny ImageNet with the proposed DGAD using the PreActResNet-18 teacher model, run:
python main_tiny.py \
--dataset Tiny_Image \
--model resnet18 \
--method DGAD \
--teacher_model presnet18 \
--epsilon 8 \
--num_steps 10 \
--step_size 2 \
--epochs 200 \
--bs 128 \
--lr_max 0.1 \
--lr_schedule piecewise \
--ALP_beta 15.0 \
--gpu_id 0
Evaluation Commands:
To evaluate a ResNet-18 model trained with the proposed DGAD on CIFAR10 (or CIFAR100), run the following command with the model path:
python main_eval.py --dataset CIFAR10 --model resnet18 --model_path <model_path/best_PGD10_acc_model.pth>
To evaluate a MobileNetV2 model trained with the proposed DGAD on CIFAR10 (or CIFAR100), run the following command with the model path:
python main_eval.py --dataset CIFAR10 --model mobilenetV2 --model_path <model_path/best_PGD10_acc_model.pth>
To evaluate a ResNet-18 model trained with the proposed DGAD on Tiny ImageNet, run the following command with the model path:
python main_eval_tintimagenet.py --dataset Tiny_Image --model resnet18 --model_path <model_path/best_PGD10_acc_model.pth>
Evaluation Results on Tiny ImageNet-200: DGAD consistently outperforms state-of-the-art methods, including AdaAD and RSLAD. Compared to ARD, DGAD's robustness against FGSM, PGD, and AA is slightly lower yet comparable, while its clean accuracy improves significantly, by 6.4%.