Uncovering Phishing Attacks using Principles of Persuasion Analysis

Detecting Phishing Attacks in Emails Using Artificial Intelligence Techniques

A novel approach for phishing detection based on the automatic identification of persuasion principles used in malicious messages

Research project report

Bustio-Martínez, L., Herrera-Semenets, V., González-Ordiano, J. A., Pérez-Guadarrama, Y., Zúñiga-Morales, L. N., Montoya-Godínez, D., Álvarez-Carmona, M. Á., & van den Berg, J.

Universidad Iberoamericana Ciudad de México, Advanced Technologies Application Center (CENATAV), Cuba, Center for Research in Mathematics (CIMAT), Monterrey Campus, Nuevo León, Mexico, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, The Netherlands

Phishing detection using Principles of Persuasion

This research explores the use of message subjectivity for detecting phishing attacks by identifying principles of persuasion (PoP) used in malicious messages. It assesses the impact of various data representations and classifiers on automatically identifying these principles and investigates how they can be leveraged for phishing detection. The study emphasizes the need for user-friendly and comprehensible models, and it finds that tree-based models, particularly Random Forest, are preferred due to their effectiveness and clarity.

This work was supported by the “Instituto de Investigación Aplicada y Tecnología” (InIAT) and the “Universidad Iberoamericana, Ciudad de México” (IBERO). Additionally, the authors thank CONAHCYT for the computing resources provided through the INAOE Supercomputing Laboratory’s Deep Learning Platform for Language Technologies. The project's web page at InIAT can be found here.

List of publications

Bustio-Martínez, L., Herrera-Semenets, V., González-Ordiano, J. A., Pérez-Guadarrama, Y., Zúñiga-Morales, L. N., Montoya-Godínez, D., Álvarez-Carmona, M. Á., & van den Berg, J. (2025). Enhanced phishing detection using multimodal data. Manuscript submitted to Knowledge-Based Systems. In review, 2nd round.

Rodríguez Díaz, A., Herrera Semenets, V., Hernández Sierra, G., Reyes Díaz, F. J., & Bustio Martínez, L. (2025). Detección de phishing en comunicaciones de voz utilizando aprendizaje automático. In SIGESTIC 2025: V Encuentro sobre Sistemas de Gestión para las Tecnologías de la Información y la Comunicación. Varadero, Cuba.

Herrera-Semenets, V., Bustio-Martínez, L., Pérez-Guadarramas, Y., González-Ordiano, J. Á., & van den Berg, J. (2024). Unmasking Phishing Attempts: A Study on Detection in Spanish Emails. In: Hernández-García, R., Barrientos, R.J., Velastin, S.A. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2024. Lecture Notes in Computer Science, vol 15369. Springer, Cham. https://doi.org/10.1007/978-3-031-76604-6_1

Bustio-Martínez, L., Herrera-Semenets, V., García-Mendoza, J.L., Álvarez-Carmona, M.A., González-Ordiano, J.A., Zúñiga-Morales, L., Quiróz-Ibarra, J.E., Santander-Molina, P.A., Van den Berg, J. (2024). Uncovering phishing attacks using principles of persuasion analysis. Journal of Network and Computer Applications, 2024, 103964, ISSN 1084-8045. https://doi.org/10.1016/j.jnca.2024.103964

Bustio-Martínez, L., Herrera-Semenets, V., González Ordiano, J. A., & Álvarez Carmona, M. Á. (2024). Detección de ataques phishing mediante técnicas de Inteligencia Artificial. Komputer Sapiens, 3 (septiembre-diciembre). Recuperado de http://komputersapiens.smia.mx/publicaciones.php#KSXVI-III

Bustio Martínez, L., Herrera-Semenets, V., Álvarez-Carmona, M. A., & González-Ordiano, J. A. (2024). La Inteligencia Artificial en la Ciberseguridad. ReinvenTec. Revista de Ciencia y Tecnología del ITTLA, 1(2). Publicado en marzo de 2024.

Bustio-Martínez, L. et al. (2023). Towards Automatic Principles of Persuasion Detection Using Machine Learning Approach. In: Hernández Heredia, Y., Milián Núñez, V., Ruiz Shulcloper, J. (eds) Progress in Artificial Intelligence and Pattern Recognition. IWAIPR 2023. Lecture Notes in Computer Science, vol 14335. Springer, Cham. https://doi.org/10.1007/978-3-031-49552-6_14

Bustio-Martínez, L., Álvarez-Carmona, M. A., Herrera-Semenets, V., Feregrino-Uribe, C., & Cumplido, R. (2022). A Lightweight Data Representation for Phishing URLs Detection in IoT Environments. Information Sciences, 603, 42-59. https://doi.org/10.1016/j.ins.2022.04.059

Data and methodology

Main findings

Principles of Persuasion

This research proposes a novel approach for phishing detection based on identifying principles of persuasion (PoP) in malicious messages.

Data Representations

It explores the impact of different data representations and machine learning classifiers on automatically identifying PoP.

No unique strategy

The study finds that there is no one-size-fits-all solution for data representation and classifier selection, and a tailored combination is needed for each principle.
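The idea of a tailored combination per principle can be sketched as a simple selection step: given one evaluation score for every (principle, representation, classifier) triple, keep the best-scoring pair for each principle. This is only an illustrative sketch, not the study's procedure. The function name, the principle, representation, and classifier labels, and all score values below are hypothetical placeholders and are not results from the research.

```python
def best_combination_per_principle(results):
    """Pick the highest-scoring (representation, classifier) pair per principle.

    `results` maps (principle, representation, classifier) -> score,
    where a higher score is better (for example, AUC-ROC).
    """
    best = {}
    for (principle, representation, classifier), score in results.items():
        current = best.get(principle)
        if current is None or score > current[2]:
            best[principle] = (representation, classifier, score)
    return best


# Placeholder scores for illustration only (NOT values from the study).
toy_results = {
    ("principle_A", "representation_1", "classifier_X"): 0.70,
    ("principle_A", "representation_2", "classifier_Y"): 0.75,
    ("principle_B", "representation_1", "classifier_X"): 0.80,
    ("principle_B", "representation_2", "classifier_Y"): 0.65,
}

selection = best_combination_per_principle(toy_results)
for principle, (representation, classifier, score) in sorted(selection.items()):
    print(principle, representation, classifier, score)
```

In this toy input each principle ends up with a different winning pair, which is the situation the finding describes: no single representation and classifier is best across all principles.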

Principles of Persuasion can be detected automatically

Machine learning models were built to detect PoP automatically, achieving AUC-ROC scores ranging from 0.7306 to 0.8191.
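For readers unfamiliar with the metric, AUC-ROC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that pairwise definition in plain Python. The function name and the toy labels and scores are assumptions made for illustration, not data from the study, and this is not necessarily how the authors computed their scores.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.

    `labels` holds 1 for positive and 0 for negative examples;
    `scores` holds the classifier's score for each example.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only (hypothetical, not from the study).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
toy_auc = auc_roc(toy_labels, toy_scores)
print(round(toy_auc, 4))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported range sits well above chance. The pairwise loop is quadratic in the number of examples; it is meant to make the definition visible rather than to be efficient.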

User-friendliness is key

The research emphasizes the importance of user-friendly and interpretable models for end-users.

The right strategy is Random Forest

Tree-based models, particularly Random Forest, are highlighted as the preferred option due to their effectiveness and clarity, achieving an AUC-ROC of 0.859842.
