For the third test case, Pandas and NumPy load the URL CSV and convert it into efficient numeric arrays for modeling. TensorFlow/Keras then tokenizes, embeds, and trains a bidirectional LSTM to classify URLs as malicious or benign, while Scikit‑learn’s train_test_split partitions the data so we can fairly evaluate generalization. Hugging Face’s Transformers library (GPT‑2) generates synthetic phishing URLs, which Python’s built‑in re module filters to keep only valid URL patterns. Finally, Matplotlib plots the training and validation accuracy curves over the epochs, providing a clear visual comparison of the original and augmented models’ performance.
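A minimal sketch of this pipeline (in Python) is given below, assuming a CSV with url and label columns; the file name, maximum sequence length, and layer sizes are illustrative assumptions rather than the exact values used in the experiment.

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf

# Load the URL dataset into numeric arrays (file and column names are assumed).
df = pd.read_csv("urls.csv")                                # hypothetical file name
urls = df["url"].astype(str).tolist()
y = np.asarray(df["label"].values, dtype="float32")         # 1 = malicious, 0 = benign

# Character-level tokenization: each URL becomes a sequence of character ids.
tokenizer = tf.keras.preprocessing.text.Tokenizer(char_level=True, lower=True)
tokenizer.fit_on_texts(urls)
X = tf.keras.preprocessing.sequence.pad_sequences(
    tokenizer.texts_to_sequences(urls), maxlen=200
)

# Hold out a test split so generalization is measured on unseen URLs.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Embedding followed by a bidirectional LSTM for binary classification.
vocab_size = len(tokenizer.word_index) + 1
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 32),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
history_orig = model.fit(X_train, y_train, validation_data=(X_test, y_test),
                         epochs=5, batch_size=128)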
Fig. 5 (Malicious URL detection using generative AI). The original model shows the performance of training on the online dataset alone, while the augmented model shows the performance of training on the online dataset mixed with synthetic data. Although the augmented model was expected to perform better because it had additional synthetic data to train with, the difference in accuracy between the two is negligible: at this scale the organic data is already abundant enough that synthetic data adds little, and synthetic augmentation matters most when organic data is scarce.
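A figure along these lines can be produced from the two Keras training histories with Matplotlib; the variable names history_orig and history_aug (the History objects returned by model.fit for the original and augmented runs) are assumptions for illustration.

import matplotlib.pyplot as plt

def plot_accuracy(history_orig, history_aug):
    # Compare per-epoch accuracy of the original and augmented models.
    plt.plot(history_orig.history["accuracy"], label="original: training")
    plt.plot(history_orig.history["val_accuracy"], label="original: validation")
    plt.plot(history_aug.history["accuracy"], label="augmented: training")
    plt.plot(history_aug.history["val_accuracy"], label="augmented: validation")
    plt.xlabel("Epoch")
    plt.ylabel("Accuracy")
    plt.title("Malicious URL detection: original vs. augmented training data")
    plt.legend()
    plt.show()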
We began by unzipping and loading a Kaggle dataset of URLs with Pandas and NumPy, then used Scikit‑learn's train_test_split to carve out real training and test sets. A character‑level tokenizer and a bidirectional LSTM classifier built in TensorFlow/Keras learned to distinguish malicious from benign URLs, reaching 99.4% accuracy on held‑out real data. Next, we invoked Hugging Face's GPT‑2 to synthesize additional phishing URLs, filtered them with Python's re module, and augmented our training set. Retraining on this combined dataset yielded 99.2% accuracy, a negligible 0.2‑point drop that reflects the original data's abundance rather than a flaw in the synthetic generation. Our end‑to‑end pipeline (data loading → preprocessing → model training → synthetic data creation → evaluation → visualization with Matplotlib) demonstrates that generative AI can seamlessly augment scarce threat datasets.

Although accuracy did not improve here, generative AI shines when real examples are limited: it could produce targeted spear‑phishing URLs tailored to the financial or healthcare sectors, or simulate zero‑day attack patterns that have not yet appeared in the wild. In the future, regularly using generative AI to create new and challenging threat scenarios could help cybersecurity systems adapt and stay effective against constantly changing attack methods. Fine‑tuning generative models on domain‑specific malware logs or network traffic could enable proactive anomaly detection, and synthetic logs could stress‑test SIEM systems under realistic attack loads. In the end, although adding synthetic data did not improve our accuracy, it demonstrates how generative AI could be a valuable building block for stronger, more flexible cybersecurity defenses in the future.
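For concreteness, the synthetic-URL step could look roughly like the sketch below, using the Hugging Face Transformers text-generation pipeline with GPT‑2 and a simple regular-expression filter; the prompt text, sampling settings, and regex are illustrative assumptions rather than the exact ones used in the experiment.

import re
from transformers import pipeline

# GPT-2 via the Transformers text-generation pipeline.
generator = pipeline("text-generation", model="gpt2")

# Prompt and sampling settings are assumptions; the aim is URL-shaped text.
raw = generator("http://secure-login.", max_new_tokens=30,
                num_return_sequences=50, do_sample=True)

# Keep only candidates that match a basic URL pattern.
url_pattern = re.compile(r"https?://[A-Za-z0-9.-]+\.[A-Za-z]{2,}(?:/\S*)?")
synthetic_urls = []
for item in raw:
    match = url_pattern.search(item["generated_text"])
    if match:
        synthetic_urls.append(match.group(0))

# The filtered URLs would then be labeled malicious, tokenized with the same
# character-level tokenizer, and appended to the real training set before
# retraining the bidirectional LSTM (producing history_aug for Fig. 5).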