Machine learning (ML) can learn and detect threat patterns effectively by analyzing large volumes of data. Typical applications include detecting malware in encrypted traffic, finding insider threats, flagging anomalous or suspicious behavior and activity, and protecting data in the cloud and on networks. Many ML algorithms can be applied to cybersecurity, including unsupervised clustering, supervised classification, and neural-network-based deep learning.
In this Getting Started learning module, we introduce Google Colab (Colaboratory), a hosted collaborative notebook platform built on the open source Jupyter project, for ML with Python in cybersecurity, using a simple Hello World example.
Each of the learning modules focuses on a specific ML technique applied to a cybersecurity case study, using the same hands-on learning environment.
For the Hello World example, we will use a malicious URL dataset. You can download the dataset directly here.
Malicious URLs have become one of the most common problems in cybersecurity. They can reach users through emails, text messages, pop-ups, or unreliable ads. The end result is often a download of malware, spyware, or ransomware, compromised accounts, and all the trouble these threats bring. Malicious websites are a major concern because analyzing each URL and adding it to a blacklist one by one is impractical. Malicious websites share some common traits. For example, a site may automatically ask users to run software or download a file when they do not expect it. It may also claim that the user's computer or device is infected with malware, or that browser extensions or software are out of date and need updates.
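As a rough illustration of what working with a labeled URL dataset can look like in Python, the sketch below parses a few rows and derives some simple lexical features from each URL. Everything specific here is an assumption made for illustration: the sample rows are made up, the column names `url` and `label` may not match the actual dataset, and the features chosen (URL length, dot count, IP-address host, HTTPS) are common starting points rather than the method used in this module.

```python
import csv
import io
import ipaddress
from urllib.parse import urlparse

# Made-up sample rows standing in for the real dataset. The column names
# "url" and "label" are assumptions, not the actual dataset schema.
SAMPLE_CSV = """url,label
http://example.com/index.html,benign
http://192.0.2.10/update/flash-player.exe,malicious
https://docs.example.org/guide,benign
http://login.example.net.account-verify.example/secure?id=1,malicious
"""


def host_is_ip(host):
    """Return True if the host part of a URL is a literal IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_features(url):
    """Compute a few illustrative lexical features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "host_is_ip": host_is_ip(host),
        "uses_https": parsed.scheme == "https",
    }


def load_rows(text):
    """Parse CSV text into (url, label, features) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["url"], row["label"], url_features(row["url"])) for row in reader]


rows = load_rows(SAMPLE_CSV)
for url, label, features in rows:
    print(label, features)
```

In a Colab notebook, the same functions could be applied to the downloaded file instead of the inline sample, and the resulting feature dictionaries could then be passed to a classifier. No single feature here separates benign from malicious URLs on its own.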