This presentation examines the vulnerability of deep learning models to adversarial attacks, specifically the Fast Gradient Sign Method (FGSM). The study compares the robustness of three representative network architectures (a simple CNN, ResNet-18, and AlexNet) against adversarial perturbations on the CIFAR-10 dataset. It evaluates model performance under varying attack intensities (𝜖 values), analyzes both full and partial attacks, and performs cross-model evaluations to determine how well adversarial examples generalize across architectures. Ultimately, the research aims to provide foundational insights for designing more robust AI systems in security-critical applications.
Deep learning models have achieved remarkable performance in tasks like image classification and natural language processing. However, these models are critically vulnerable to adversarial attacks: minor, often imperceptible perturbations to the input can lead to significant misclassifications.
Adversarial Attacks: Techniques that subtly alter input data (e.g., by adding noise) to mislead the model.
Research Questions:
How do different neural network architectures respond to FGSM-based adversarial attacks?
Can partial attacks reveal localized vulnerabilities within these models?
Do adversarial examples generated from one model transfer and deceive other models?
✅ Provides a comparative analysis of CNN, ResNet-18, and AlexNet under FGSM attacks.
✅ Examines the effects of full attacks versus partial attacks to identify region-specific vulnerabilities.
✅ Conducts cross-model evaluations to assess whether adversarial examples are transferable across different architectures.
✅ Offers experimental evidence to guide the development of more resilient neural network designs for security-sensitive applications.
FGSM (Fast Gradient Sign Method): An attack that generates adversarial samples by perturbing the input in the direction of the sign of the gradient of the loss function with respect to that input.
Formula: 𝑥′ = 𝑥 + 𝜖 · sign(∇ₓ J(𝜃, 𝑥, 𝑦))
The parameter 𝜖 controls the attack intensity: larger values produce stronger perturbations, while even small values can mislead the model with changes that remain nearly imperceptible.
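For illustration, a minimal PyTorch sketch of this formula follows; the choice of cross-entropy loss and the assumption that pixel values lie in [0, 1] are illustrative, not details taken from the presentation.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """FGSM: x' = x + epsilon * sign(grad_x J(theta, x, y))."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)        # J(theta, x, y)
    loss.backward()                                # populates x_adv.grad
    perturbation = epsilon * x_adv.grad.sign()     # epsilon * sign(gradient)
    x_adv = (x_adv + perturbation).clamp(0, 1)     # assumes inputs scaled to [0, 1]
    return x_adv.detach()
```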
Dataset: CIFAR-10, consisting of 60,000 32×32 color images categorized into 10 classes (50,000 for training, 10,000 for testing).
Neural Network Models:
CNN: A simple architecture using convolutional and pooling layers to capture local features.
ResNet-18: Utilizes residual connections to facilitate the training of deeper networks and mitigate the vanishing gradient problem.
AlexNet: A relatively shallow network with large filters for feature extraction.
Experimental Environment & Hyperparameters:
Implemented using the PyTorch framework on an Nvidia RTX 4060 GPU.
Learning Rate: 0.01, Batch Size: 64, Optimizer: Adam, Epochs: 50.
FGSM attack intensity (𝜖 values) ranging from 0 to 1.0.
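To make this setup concrete, a minimal training sketch under these hyperparameters is given below; the use of torchvision's ResNet-18 with a 10-class output layer and plain ToTensor preprocessing are assumptions for illustration, not details from the presentation.

```python
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# CIFAR-10: 50,000 training / 10,000 test images, 32x32 RGB, 10 classes
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

# One of the three architectures; torchvision's ResNet-18 shown here,
# with a 10-class output layer (an assumed adaptation for CIFAR-10).
model = torchvision.models.resnet18(num_classes=10).to(device)

# Hyperparameters stated in the presentation: Adam, lr 0.01, batch 64, 50 epochs
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

for epoch in range(50):
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```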
Attack Types:
Full Attack: FGSM applied to all pixels of the image to generate adversarial examples.
Partial Attack: FGSM applied only to specific regions (e.g., top, center, border) to examine localized vulnerabilities.
Cross-model Evaluation: Evaluates whether adversarial examples generated from one model remain effective when used against other models.
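Building on the attack types above, a partial attack can be sketched by masking the FGSM perturbation so that only a chosen region of the image is modified. The helper below reuses the hypothetical fgsm_attack function from the earlier sketch; the exact region boundaries (top, center, border) are assumptions for illustration.

```python
import torch

def partial_fgsm_attack(model, x, y, epsilon, region="top"):
    """Apply the FGSM perturbation only inside a selected region of the image."""
    x_adv = fgsm_attack(model, x, y, epsilon)   # fully perturbed image
    mask = torch.zeros_like(x)                  # 1 = attacked pixel, 0 = untouched
    _, _, h, w = x.shape
    if region == "top":
        mask[:, :, : h // 3, :] = 1
    elif region == "center":
        mask[:, :, h // 4 : 3 * h // 4, w // 4 : 3 * w // 4] = 1
    elif region == "border":
        mask[:] = 1
        mask[:, :, h // 4 : 3 * h // 4, w // 4 : 3 * w // 4] = 0
    return x * (1 - mask) + x_adv * mask        # perturb only the masked region
```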
Full Attack Findings:
At 𝜖 = 0.01, most samples are still correctly classified.
At 𝜖 = 0.1, there is a sharp decline in classification accuracy across all models.
For 𝜖 ≥ 0.3, nearly all samples are misclassified.
Partial Attack Findings:
CNN shows significant performance degradation even when attacks target only specific regions.
ResNet-18 demonstrates relative robustness due to its residual architecture.
AlexNet is the most vulnerable under full attacks, while the CNN is particularly sensitive to partial attacks.
Cross-model Evaluation:
Adversarial examples generated by one model often lead to high misclassification rates in other models, with AlexNet-derived examples transferring especially effectively.
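As an illustration of how such transferability can be measured, the sketch below crafts adversarial examples on a source model and scores them on a target model; it reuses the hypothetical fgsm_attack helper and is an illustrative sketch, not the presentation's exact protocol.

```python
import torch

def transfer_attack_accuracy(source_model, target_model, loader, epsilon, device="cuda"):
    """Accuracy of target_model on adversarial examples crafted against source_model."""
    source_model.eval()
    target_model.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        x_adv = fgsm_attack(source_model, images, labels, epsilon)  # craft on source
        with torch.no_grad():
            preds = target_model(x_adv).argmax(dim=1)               # evaluate on target
        correct += (preds == labels).sum().item()
        total += labels.size(0)
    return correct / total

# e.g., transfer_attack_accuracy(alexnet, resnet18, test_loader, epsilon=0.1)
```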
Architectural Vulnerability Analysis: The study experimentally confirms that different network architectures exhibit varying degrees of vulnerability to FGSM attacks.
Generalization of Adversarial Examples: It demonstrates that adversarial samples are not model-specific; examples crafted from one architecture can deceive others, highlighting a widespread threat.
Guidance for Robust Model Design: The experimental findings provide valuable data for designing more resilient AI systems in applications such as autonomous driving, medical diagnostics, and financial forecasting.
Study Limitations:
The evaluation is limited to FGSM attacks, without testing more potent iterative techniques such as PGD (Projected Gradient Descent) or CW (Carlini-Wagner).
Experiments are confined to the CIFAR-10 dataset, necessitating further validation in real-world scenarios.
Future Research Directions:
✅ Expand the analysis to include additional adversarial attacks (e.g., PGD, CW, DeepFool).
✅ Explore various defense mechanisms, such as adversarial training, input preprocessing, and gradient masking.
✅ Assess the implications of adversarial attacks in practical settings like autonomous vehicles and medical imaging.
8️⃣ Conclusion
📌 Key Takeaways:
FGSM-based adversarial attacks reveal distinct vulnerabilities across different neural network architectures—CNN and AlexNet are notably susceptible, while ResNet-18 remains relatively robust.
Both full and partial attack strategies demonstrate that adversarial examples can compromise model performance, and these effects are transferable across models.
📌 Final Thoughts:
✅ This study provides a systematic analysis of adversarial vulnerabilities in deep learning models and offers experimental evidence to support the development of more robust network designs.
✅ Future research should integrate multiple attack and defense strategies to enhance the security of AI systems in real-world applications.