In the fifth test case, we investigated how generative AI can be used not only to defend against threats but also to exploit AI systems through adversarial attacks. We trained a binary classification model on the NSL-KDD dataset using TensorFlow and Keras to detect network intrusions. Once the model was trained, we applied the Fast Gradient Sign Method (FGSM) to generate adversarial inputs: slightly altered versions of the original data that appear normal yet are specifically crafted to cause misclassification. These perturbations were computed from the model's own gradients, enabling efficient generation of misleading inputs. The adversarial examples were then evaluated against the original classifier to test its vulnerability, and the clean and adversarial samples, along with their predictions, were exported to an Excel spreadsheet to illustrate how even small, AI-crafted changes can bypass machine learning models. This test case highlights the dual role of generative AI in cybersecurity: while it can improve detection accuracy, it can also be weaponized to fool AI systems, underscoring the need for more robust, adversarially trained defenses.
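The sketch below illustrates this workflow. It is a minimal reconstruction rather than the exact experimental code: the network architecture, the epsilon value of 0.05, the output file name fgsm_evaluation.xlsx, and the random placeholder arrays (which stand in for the preprocessed NSL-KDD features, assumed to be encoded and scaled to [0, 1]) are all illustrative assumptions.

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras

def build_classifier(n_features):
    # Small binary classifier (normal traffic vs. intrusion).
    model = keras.Sequential([
        keras.layers.Input(shape=(n_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

def fgsm_attack(model, x, y, epsilon=0.05):
    # FGSM: shift each input feature by epsilon in the direction of the sign of
    # the loss gradient with respect to that input, then clip to the valid range.
    x_t = tf.convert_to_tensor(x, dtype=tf.float32)
    y_t = tf.convert_to_tensor(np.asarray(y).reshape(-1, 1), dtype=tf.float32)
    loss_fn = keras.losses.BinaryCrossentropy()
    with tf.GradientTape() as tape:
        tape.watch(x_t)
        loss = loss_fn(y_t, model(x_t, training=False))
    grad = tape.gradient(loss, x_t)
    return tf.clip_by_value(x_t + epsilon * tf.sign(grad), 0.0, 1.0).numpy()

# Placeholder arrays standing in for the preprocessed NSL-KDD data; in the actual
# test case these come from encoding and min-max scaling the dataset features.
rng = np.random.default_rng(0)
n_features = 41
x_train = rng.random((1000, n_features), dtype=np.float32)
y_train = rng.integers(0, 2, 1000)
x_test = rng.random((200, n_features), dtype=np.float32)
y_test = rng.integers(0, 2, 200)

model = build_classifier(n_features)
model.fit(x_train, y_train, epochs=10, validation_split=0.2, verbose=0)

x_adv = fgsm_attack(model, x_test, y_test)
clean_acc = model.evaluate(x_test, y_test, verbose=0)[1]
adv_acc = model.evaluate(x_adv, y_test, verbose=0)[1]
print(f"clean accuracy: {clean_acc:.3f}, adversarial accuracy: {adv_acc:.3f}")

# Export clean and adversarial predictions side by side for inspection in Excel.
pd.DataFrame({
    "true_label": y_test,
    "clean_pred": (model.predict(x_test, verbose=0) > 0.5).astype(int).ravel(),
    "adversarial_pred": (model.predict(x_adv, verbose=0) > 0.5).astype(int).ravel(),
}).to_excel("fgsm_evaluation.xlsx", index=False)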
Fig. 6 (Adversarial Attacks and AI Model Exploit Generation). Training vs. Validation Accuracy across epochs during model training. The figure demonstrates the model’s high learning efficiency, showing minimal overfitting and consistently strong validation performance. This supports the robustness of the model prior to adversarial input generation.
The adversarial attack demonstrates the vulnerability of AI models to even small, targeted changes in input data. Although the model initially performed with high accuracy on both training and validation sets, the FGSM-based adversarial samples significantly reduced its performance. These crafted inputs were nearly identical to the original data yet were specifically designed to mislead the model. The drop in accuracy during adversarial evaluation highlights how attackers can exploit model weaknesses; because adversarial examples often transfer between models, full access to a target system's internal mechanisms is not strictly required. This reinforces the importance of integrating adversarial robustness into AI systems, especially in cybersecurity applications, where threat actors may attempt to bypass detection systems using similar strategies.
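The perturbation behind this effect is small by construction. In the standard FGSM formulation (using the usual notation for that method rather than symbols defined elsewhere in this paper), the adversarial example is

x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\left( \nabla_x J(\theta, x, y) \right),

where J is the model's loss for parameters \theta, input x, and true label y, and \epsilon is a small constant. Because each feature is shifted by at most \epsilon, the adversarial samples remain nearly indistinguishable from the originals while still pushing the model across its decision boundary.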