This project aims to create a Network Intrusion Detection System (NIDS) with machine learning capabilities. The system monitors network traffic and detects malicious activity that could lead to future attacks, using a neural-network classification approach to predict whether traffic is malicious or normal. A NIDS addresses the growing number of network attacks against devices, attacks that can further damage a company. Many companies rely on internet-based services, applications, and online transactions; this system therefore aims to provide another layer of protection and detection against network attacks that could affect those services.
This YouTube video provides more information about the topic, the objectives behind developing this project, and additional details covering the project overview.
In this architecture, we input our dataset and process the data in Python for cleaning and transformation. We then feed the transformed data to our neural network algorithm to produce the output.
For this project, we are going to use the UNSW-NB15 dataset.
This dataset contains raw network packets created by the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS).
The dataset contains around two million records, each with 49 features.
The picture above shows the distribution of normal and abnormal behavior (0 = normal, 1 = abnormal). In addition, the dataset has a subcategory that classifies each attack into one of nine categories.
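As a sketch of how this distribution could be checked with pandas (the small DataFrame below is a hypothetical stand-in for the loaded UNSW-NB15 records; the real project would read the CSV files with `pd.read_csv` instead):

```python
import pandas as pd

# Hypothetical sample; in the project, the UNSW-NB15 CSV files would be
# loaded with pd.read_csv and concatenated into one frame.
df = pd.DataFrame({
    "label": [0, 0, 0, 1, 1, 0, 1, 0],  # 0 = normal, 1 = abnormal
    "attack_cat": [None, None, None, "Exploits", "Generic", None, "Fuzzers", None],
})

# Binary distribution of normal vs. abnormal records
binary_counts = df["label"].value_counts()
print(binary_counts)

# Attack-category breakdown for the abnormal records only
cat_counts = df.loc[df["label"] == 1, "attack_cat"].value_counts()
print(cat_counts)
```

On the full dataset, the same two `value_counts` calls would produce the distributions plotted in the figure.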
The graph shows how balanced our data is. We have around two million network traffic records, and some data exploration reveals their distribution. The graph shows that we have enough data in each subcategory to build an optimal model.
The YouTube presentation covers findings and techniques used in the model to analyze the network traffic data. Phase II of my project focuses on data exploration, cleaning, and transformation. In addition, it includes some initial results from applying the data to our Keras classifier model. The presentation for this phase is available on my GitHub, where the code implementation is also available.
Creating a correlation matrix gives us a view of the correlations between features, which offers an early indication of whether our model will work.
The graph shows some correlation between features, which supports our decision to build a neural network model rather than an alternative such as a random forest.
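A correlation matrix of this kind can be computed directly with pandas. The column names below (`sbytes`, `spkts`, `dur`) are real UNSW-NB15 features, but the values are synthetic, generated only to illustrate the computation:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for three UNSW-NB15 numeric features
rng = np.random.default_rng(0)
sbytes = rng.integers(100, 10_000, size=200)
df = pd.DataFrame({
    "sbytes": sbytes,                                      # source bytes
    "spkts": sbytes // 50 + rng.integers(0, 5, size=200),  # deliberately correlated
    "dur": rng.random(200),                                # roughly independent
})

# Pearson correlation matrix over all numeric features
corr = df.corr()
print(corr.round(2))

# A heatmap view would typically follow, e.g.:
# import seaborn as sns; sns.heatmap(corr, annot=True)
```

On the real dataset, strongly correlated feature pairs stand out as the bright cells in the heatmap figure.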
The correlation graph reflects the new labels, transformations, and balanced data that make our dataset ready for the model.
The figure above shows which features contain null values that can affect our model and its prediction accuracy. The two features with the most null values are "ct_flw_http_mthd" and "is_ftp_login." The challenge for this data is to determine whether it is significant for our prediction: does it meaningfully describe network communication, or is it just noise that we should remove from our training datasets?
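A minimal sketch of the null-value check, plus one possible handling choice (the tiny frame below is illustrative, not the real data, and filling with zero is an assumption, not necessarily what the final model used):

```python
import numpy as np
import pandas as pd

# Illustrative frame with the two features that hold most of the nulls
df = pd.DataFrame({
    "ct_flw_http_mthd": [1.0, np.nan, np.nan, 2.0],
    "is_ftp_login": [np.nan, 0.0, np.nan, np.nan],
    "dur": [0.1, 0.2, 0.3, 0.4],
})

# Count nulls per feature, highest first (this is what the figure plots)
null_counts = df.isnull().sum().sort_values(ascending=False)
print(null_counts)

# One common option: treat a missing count/flag as zero rather than drop rows
df_filled = df.fillna({"ct_flw_http_mthd": 0, "is_ftp_login": 0})
```

Whether to fill or drop these columns depends on the significance question raised above.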
The graphs above show the packet behavior for each attack category, including the Normal category. They plot the number of packets from the source and destination hosts, and we can see spikes in some network traffic. For example, in traffic classified as Exploits, some packet transfers between source and destination hosts go well above the mean packet count typical for that category.
In this project, I am using a Keras classifier to set up my neural network model. The initial configuration uses 25 epochs and a batch size of 15, with six hidden layers using "relu" activation. As expected, the initial results are off, so I will run more epochs and experiment with the number of hidden layers and the batch size to reach an accuracy that aligns with our validation data. Furthermore, I will perform a deeper analysis of the data to determine which pieces of network traffic to keep and/or transform for more accurate results. Traffic data that is not significant to the final prediction will be removed from the training datasets.
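A sketch of this initial configuration in Keras. The layer width (64 units), the optimizer, and the feature count are assumptions, since the text only specifies six ReLU hidden layers, 25 epochs, and a batch size of 15:

```python
from tensorflow import keras
from tensorflow.keras import layers

N_FEATURES = 42  # placeholder; depends on the transformed dataset

# Six hidden Dense layers with ReLU, sigmoid output for normal vs. abnormal
model = keras.Sequential(
    [keras.Input(shape=(N_FEATURES,))]
    + [layers.Dense(64, activation="relu") for _ in range(6)]
    + [layers.Dense(1, activation="sigmoid")]
)
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy", keras.metrics.AUC(name="auc")],
)
model.summary()

# Training would then use the configuration described in the text:
# model.fit(X_train, y_train, epochs=25, batch_size=15,
#           validation_data=(X_val, y_val))
```

Tuning then amounts to varying the number of hidden layers, the batch size, and the epoch count as described above.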
For the presentation and source code of this final phase of the project, please visit the GitHub repository of this project.
The first part is to analyze the distribution between normal (represented by 0) and abnormal network traffic (represented by 1). At first glance, we have imbalanced data that might affect the performance of our model. Therefore, it was necessary to apply extra configuration and balancing methods to the dataset.
The results shown for loss, PRC, precision, and recall from our initial model indicate that extra adjustments are needed, since the imbalanced data hurts the model's performance.
To overcome the imbalance in the dataset, I applied several methods, including adding an output-bias value, class weights, and resampling the dataset.
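These three methods can be sketched with NumPy and scikit-learn. The 90/10 label split below is hypothetical, and the exact way the project wired these into the Keras model is an assumption:

```python
import numpy as np
from sklearn.utils import resample
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: 90 normal (0), 10 abnormal (1)
y = np.array([0] * 90 + [1] * 10)

# 1) Class weights: penalize minority-class mistakes more heavily;
#    the dict would be passed to model.fit(class_weight=...)
classes = np.unique(y)
weights = compute_class_weight("balanced", classes=classes, y=y)
class_weight = dict(zip(classes, weights))
print(class_weight)

# 2) Output-bias initialization: start the sigmoid output at the base
#    rate, b = log(pos/neg); set as the final Dense layer's bias init
initial_bias = np.log(10 / 90)

# 3) Resampling: upsample the minority class to match the majority
minority = y[y == 1]
upsampled = resample(minority, replace=True, n_samples=90, random_state=42)
y_balanced = np.concatenate([y[y == 0], upsampled])
```

With "balanced" weighting, the rare class gets a weight inversely proportional to its frequency, so each class contributes equally to the loss.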
In this case, the matrix shows that I have relatively few false positives, meaning that relatively little legitimate network traffic was incorrectly flagged. This trade-off may be preferable because false negatives would allow malicious network traffic to go through, whereas false positives only cause legitimate traffic to be blocked from the company network.
The results show that validation and training accuracy align as the epochs increase. Therefore, the model performs better than in our initial results, which suggests that the bias and class weights helped. Note that at the scale of the graph we see no oscillation, but only because the oscillation is so small; we would need to zoom in to see the changes at each epoch.
The confusion matrix shows that, compared to our initial model, more normal traffic and more abnormal traffic were classified correctly.
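A confusion matrix like this one can be produced with scikit-learn. The labels below are a hypothetical validation slice, not results from the actual model:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predictions (1 = abnormal traffic)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])
y_pred = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])

cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()  # row = true class, column = predicted class
print(cm)
print(f"false positives: {fp}, false negatives: {fn}")
```

The false-negative cell is the one to watch, since those are malicious flows that slip through.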
One of the goals of this project was to correctly classify the type of malicious network traffic in the dataset. The graph on the left shows the nine categories and their percentages in our sample data.
As the figures show, the model is overfitting the dataset. Therefore, we will need to increase the hidden layers and make some extra modifications to improve the model's performance.
Like the accuracy results, the loss graph comparing the training and validation datasets shows overfitting in our model. At this point, we need to work more on the model's configuration.
In the end, the model's performance was acceptable and exceeded expectations, considering the initial results and the data distribution. The graph shows that the configuration and class weights helped the model achieve better accuracy, and they also increased the precision, recall, and AUC scores.
The confusion matrix tells us which network attack categories our model is able to predict. Unfortunately, the model predicts only four of the nine categories we are trying to identify.
To evaluate the model, I considered precision, recall, F1-score, accuracy, AUC, and PRC to be the best metrics. The figure on the left gives the model's precision, along with its recall and F1-score. The highest score we were able to achieve was in the Generic category.
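These per-category metrics can be gathered in one call with scikit-learn's `classification_report`. The three categories and the predictions below are illustrative only, not the model's actual output:

```python
from sklearn.metrics import accuracy_score, classification_report

# Hypothetical multi-class predictions over three attack categories
y_true = ["Generic", "Generic", "Exploits", "Fuzzers", "Generic", "Exploits"]
y_pred = ["Generic", "Generic", "Exploits", "Generic", "Generic", "Fuzzers"]

# Per-category precision, recall, and F1 in a single table
print(classification_report(y_true, y_pred, zero_division=0))
acc = accuracy_score(y_true, y_pred)
```

Categories the model never predicts show up in this table with zero precision and recall, which makes the four-of-nine limitation easy to spot.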
The last method applied considers not just the attack categories but also normal traffic. With some extra configuration, we were able to reach roughly the same performance and accuracy.
The loss graph shows the prediction error of the neural net. In this case, our results are good and align with the other metrics, a good indication that the model is doing a decent job given the dataset's distribution.
The results in this confusion matrix are very similar to those of the previous models: four of the nine categories still give us predictions with the latest model. In brief, this latest method produced results similar to the previous ones. Given our dataset's characteristics, we were still able to improve the model's performance and predict some of the network attack categories.
We had imbalanced data that affected our model's performance. However, we were able to apply extra configuration to improve it.
The model designed to detect normal versus abnormal network traffic performed very well. In fact, it exceeded expectations given the imbalanced data: the scores for accuracy, precision, recall, and AUC were all above 85%.
The second model, designed to predict which of the nine categories of malicious network traffic we have, had trouble effectively predicting all the available network attack categories.
The second model was able to predict four of the nine categories.
The metrics I used to evaluate this model were accuracy, precision, recall, and F1. Its accuracy was 62%.
Several issues affected our model, such as the unbalanced data.
Another issue that negatively affected the model is that some network attack categories have similar behavior, which made it hard for the model to tell them apart.
To identify each category better, we need more example data for each category, especially for attack categories with similar behavior, such as reconnaissance and scan attacks.
Find a dataset with balanced records across the different network attack categories, or collect network traffic from different sources and build one effective network dataset.
Compare my model against other machine learning algorithms, such as random forests.
Design the model to perform live network traffic analysis by feeding it directly from tools like Wireshark, Cisco, Palo Alto, etc.
Time to develop this project.
Visit the GitHub repository to access the datasets, programs, and updates on the project.
Abdelhameed, M., & Nour, M. (2018, Nov. 14). The UNSW-NB15 Dataset Description [Data set]. The University of New South Wales. https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/
Cuelogic Technologies. (2019, May 13). Evaluation of Machine Learning Algorithms for Intrusion Detection System. Medium. https://medium.com/cuelogic-technologies/evaluation-of-machine-learning-algorithms-for-intrusion-detection-system-6854645f9211
Almseidin, M., Alzubi, M., Kovacs, S., & Alkasassbeh, M. (2020). Evaluation of Machine Learning Algorithms for Intrusion Detection System. Mutah University, Amman, Jordan. https://arxiv.org/ftp/arxiv/papers/1801/1801.02330.pdf
Ahmad, Z., Khan, A., Shiang, C., Abdullah, J., & Ahmad, F. (2020, Oct. 16). Network intrusion detection system: A systematic study of machine learning and deep learning approaches. Wiley Online Library. https://onlinelibrary.wiley.com/doi/full/10.1002/ett.4150
MLK. (2019, Oct. 30). Animated Explanation of Feed Forward Neural Network Architecture. MLK Making AI Simple. https://machinelearningknowledge.ai/animated-explanation-of-feed-forward-neural-network-architecture/
Sharma, B. (2019, Sept. 4). Evaluating a Machine Learning Model. Skyl.ai. https://blog.skyl.ai/evaluating-a-machine-learning-model/
Stone, M. (2021, April 9). Intrusion Detection Systems (IDS) explained. AT&T. https://cybersecurity.att.com/solutions/intrusion-detection-system/ids-explained
Brownlee, J. (2016, Sept. 21). How To Improve Deep Learning Performance. Machine Learning Mastery. https://machinelearningmastery.com/improve-deep-learning-performance/
Wang, C.-F. (2018, Aug. 16). Different Ways of Improving Training Accuracy. Towards Data Science. https://towardsdatascience.com/different-ways-of-improving-training-accuracy-c526db15a5b2
Sharma, P. (2019, Nov. 7). 4 Proven Tricks to Improve your Deep Learning Model's Performance. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2019/11/4-tricks-improve-deep-learning-model-performance/