Authentic Learning of Machine Learning in Cybersecurity with Portable Hands-on Labware

Security threats are evolving and becoming more concealed and complex. Detecting malicious threats and attacks has become a heavy burden on our cyberspace. We should apply proactive prevention and early detection of security vulnerabilities and threats rather than patching security holes afterward. Analyzing huge amounts of data to find suspicious behaviors, threat patterns, and vulnerabilities, and to predict and prevent future cybersecurity threats, is a challenge. Machine Learning (ML) is a powerful instrument for meeting such challenges.

Authentic learning has gained popularity in recent years as a way to teach cybersecurity topics. The authentic learning approach creates an engaging and motivating learning environment that encourages all students to learn emerging technologies through hands-on laboratory practice on real-world topics, where each topic consists of a series of progressive sub-labs: a pre-lab, a hands-on lab activity, and a student add-on post-lab. Many schools offer ML courses and cybersecurity courses in their computing curriculum; however, integrating authentic learning-based ML into the cybersecurity curriculum is not yet commonplace. There is a scarcity of open-source portable hands-on labware for the authentic learning of ML in Cybersecurity (MLC). Challenges in offering authentic learning-based MLC resources commonly include high infrastructure costs, difficulties in configuring open-source applications, a shortage of qualified faculty and technical staff, and the time required to develop open-source materials. To overcome these difficulties, this project proposes to develop the cyber workforce through authentic learning of MLC topics, using a set of real-world cybersecurity learning modules that follow a pre-lab, lab activity, and post-add-on lab learning cycle.

The proposed portable labware will be designed, developed, and deployed on the open-source Google CoLaboratory (CoLab) environment, where learners can access and practice all labs interactively in a browser, anywhere and anytime, without tedious installation and configuration. The proposed hands-on lab modules will also help a wide audience learn the subjects effectively, resulting in more efficient student learning and engagement. This project will help enhance cybersecurity curricula across computing disciplines integrated with data science and strengthen students' active learning and problem-solving capabilities.

Figure 1: Machine learning algorithm applications in Cybersecurity

Machine Learning for Cybersecurity Modules

M0. Getting Started with CoLab on ML for Cybersecurity

M1. Naive Bayes for spam email filtering

M2. Logistic Regression for financial fraud prediction

M3. Neural network algorithms for network DoS detection

M4. Convolutional Neural Network (CNN) for CAPTCHA bypass

M5. Decision Tree for website phishing

M6. Deep learning for malware classification and protection

M7. Support Vector Machine (SVM) for anomaly-based intrusion detection

M8. K-Means clustering for ransomware detection

M9. Decision Tree for malicious web application detection (malicious pages, URLs, HTTP requests)

M10. K-Nearest Neighbors (KNN) classification for user behavior anomaly

Machine learning algorithms come in many shapes and forms, but most of them perform classification, regression, or clustering tasks. The real-world cases are selected from OWASP open-source projects covering common cybersecurity vulnerabilities and threats, together with relevant open-source datasets. The modules are designed so that students learn not only ML algorithms but also their application in cybersecurity through hands-on labs based on real-world examples.

We design and develop 10 ML for cybersecurity modules selected from these three categories and apply them in various real-world cybersecurity cases for threat prediction, prevention, and security vulnerability protection.
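
As a rough illustration of the classification category, the sketch below shows what a Naive Bayes spam filter in the spirit of module M1 might look like. It is not taken from the project's labware: the class name NaiveBayesSpamFilter, the tokenize helper, and the four toy messages are all hypothetical, and the code uses only the Python standard library so that it can run in a CoLab cell without any setup. A real lab would more likely use an established library and an open-source email dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Hypothetical multinomial Naive Bayes classifier with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha              # smoothing strength
        self.word_counts = {}           # label -> Counter of word frequencies
        self.class_counts = Counter()   # label -> number of training messages
        self.vocab = set()              # every word seen during training

    def fit(self, messages, labels):
        for text, label in zip(messages, labels):
            self.class_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Start from the log prior, then add one log likelihood per known word.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                numerator = counts[word] + self.alpha
                denominator = total_words + self.alpha * len(self.vocab)
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented purely for illustration.
train_messages = [
    "win a free prize now",
    "free money claim your prize",
    "meeting agenda for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesSpamFilter().fit(train_messages, train_labels)
print(model.predict("claim your free prize"))
print(model.predict("project meeting on monday"))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing term keeps a word that appears in only one class from driving the other class's probability to zero.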

Acknowledgement

The work is partially supported by the National Science Foundation under collaborative research awards #2100134, #2100115, and #2433800 (September 1, 2021 – August 31, 2024).

Collaborative Research: Broadening SaTC: EDU: Authentic Learning of Machine Learning in Cybersecurity with Portable Hands-on Labware

Tuskegee University and University of West Florida