Research Question: How effective are LAD methods in detecting attacks in cloud systems?
Baseline Evaluation
We evaluate the effectiveness of LAD methods in cloud systems under each attack scenario, without considering distribution shifts.
Answer to Research Question:
Both prediction-based LAD methods (e.g., DeepLog and LogAnomaly) and reconstruction-based methods (e.g., CHIDS and AE) detect cloud attacks effectively, with average F1 scores above 0.9. This indicates that current methods can handle in-distribution scenarios. Our VAE implementation achieves the best results, with detection rates of approximately 0.99 for each attack.
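For reference, the F1 score used throughout this evaluation is the harmonic mean of precision and recall. The following is a minimal sketch of how it could be computed from binary labels; the function name `f1_score` and the toy labels are assumptions for illustration and are not taken from our evaluation pipeline or data.

```python
def f1_score(y_true, y_pred):
    """Compute F1 for binary labels, where 1 = attack and 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks detected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: three attack samples and two normal samples.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score(y_true, y_pred)
```

With these toy labels, precision and recall are both 2/3, so the F1 score is also 2/3. A single missed attack and a single false alarm are enough to pull the score well below 0.9.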
However, certain attacks remain challenging for some LAD baselines, which can raise security concerns. Specifically, prediction-based methods (including DeepLog and LogAnomaly) generally yield lower results for Denial of Service (DoS) attacks within cloud applications, with F1 scores of 0.8230 and 0.8663, respectively. We conjecture that DoS attacks inherently carry less semantic information than other attacks and that their system calls are often similar to those of normal behavior.
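To illustrate this conjecture, consider a toy prediction-based detector. The sketch below is not DeepLog or LogAnomaly: it replaces their learned sequence models with a simple frequency table over next events, and the system call traces are hypothetical. It only shows how a detector that flags events outside its top-k predictions could miss a flood built entirely from normal-looking call transitions.

```python
from collections import Counter, defaultdict


def fit_next_event_model(sequences):
    """Count how often each event follows each preceding event."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def is_anomalous(seq, counts, k=2):
    """Flag a sequence if any event is outside the top-k predicted next events."""
    for prev, nxt in zip(seq, seq[1:]):
        top_k = [event for event, _ in counts.get(prev, Counter()).most_common(k)]
        if nxt not in top_k:
            return True
    return False


# Hypothetical normal traces of a request-serving process.
normal_traces = [
    ["accept", "read", "write", "close", "accept", "read", "write", "close"],
    ["accept", "read", "read", "write", "close"],
]
model = fit_next_event_model(normal_traces)

# A flood of requests repeats only transitions seen in normal traffic.
flood = ["accept", "read", "write", "close"] * 50
# An attack that introduces an unseen call breaks the predicted pattern.
injection = ["accept", "read", "execve", "write", "close"]

flood_flagged = is_anomalous(flood, model)
injection_flagged = is_anomalous(injection, model)
```

In this toy setting the injection trace is flagged because `execve` never follows `read` in the normal traces, while the flood is not flagged because every individual transition is a top-k prediction. Detecting such a flood would require a signal beyond next-event order, such as event volume or rate.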