In our evaluation (the above Table), the performance of HRAT (the second row) differed significantly from the claims made in the original paper, prompting us to conduct a detailed analysis. Indeed, its replicability has also proven challenging in another reproducibility study [1].
In HRAT, the process of conducting adversarial attacks on graph structures is formulated as a reinforcement learning problem. Specifically, the authors define four types of graph modification rules that maintain semantic consistency. A DQN treats these four operations as candidate actions and decides which one to execute. For each action, since the graph can be represented as an adjacency matrix, a gradient computation is performed over the entire matrix to select the position with the largest gradient for the operation. If the operation successfully flips the sample's label, a positive reward is obtained; otherwise, a negative reward is received whose magnitude scales with the changes in the number of edges and nodes in the graph.
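To make the position-selection step concrete, below is a minimal sketch, assuming a hypothetical differentiable classifier `model` that maps a dense adjacency matrix to class logits; it illustrates the idea rather than reproducing HRAT's released code.

```python
import torch
import torch.nn.functional as F

def select_perturbation_position(model, adj, label):
    """Rank candidate adjacency-matrix entries by gradient magnitude.

    The binary matrix is treated as continuous so that autograd can score
    every entry; the entry with the largest gradient is chosen as the
    position to perturb. `model` is a hypothetical differentiable
    classifier mapping an adjacency matrix to class logits.
    """
    adj = adj.clone().detach().requires_grad_(True)
    logits = model(adj)                          # shape: (num_classes,)
    loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([label]))
    loss.backward()
    grad = adj.grad.abs()
    flat_idx = torch.argmax(grad).item()         # strongest-gradient entry
    return divmod(flat_idx, adj.size(1))         # (row, col) to modify
```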
We summarize the limitations of HRAT in four aspects:
(1) Memory-Intensive: HRAT cannot run on MalScan (Average and Concentrate) because their memory requirements are four times greater than those of MalScan (Degree). Even when running MalScan (Degree) with KNN-1, approximately 30 GB of the 46 GB of GPU memory on an NVIDIA L40 is consumed, because the entire training set must be loaded into memory along with the large adjacency matrix of the graph.
(2) Gradient-Dependent: Tree-based classifiers such as AdaBoost and Random Forest are non-differentiable, rendering the gradient-search technique used by the authors ineffective.
(3) Coarse Gradient Information: In addition to using gradient information from the model to guide the optimization direction, the authors compute gradients over the adjacency matrix to estimate where to place perturbations. However, because the perturbation space is enormous and the matrix entries are binary, this estimate is coarse: the gradient treats each entry as continuous, so the entry with the largest gradient is not necessarily the best discrete flip (e.g., changing a 1 to a 0). Such erroneous choices accumulate over the course of the attack.
(4) Ineffective Reward Mechanism: The authors use the amount of perturbation (changes in the number of edges and nodes) as the reward to guide reinforcement learning, as sketched below. However, fewer perturbations do not necessarily indicate proximity to the decision boundary, which again causes errors to accumulate.
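For concreteness, the reward shaping described in (4) can be sketched as follows; the scaling constant `alpha` is our illustrative assumption, not a coefficient published by the authors.

```python
def hrat_style_reward(pred_label, target_label, n_edge_changes, n_node_changes,
                      alpha=0.1):
    """Positive reward on successful evasion; otherwise a penalty that
    grows with the perturbation size. `alpha` is an assumed constant."""
    if pred_label == target_label:               # label successfully flipped
        return 1.0
    return -alpha * (n_edge_changes + n_node_changes)
```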
[1] Olszewski, Daniel, et al. ""Get in Researchers; We're Measuring Reproducibility": A Reproducibility Study of Machine Learning Papers in Tier 1 Security Conferences." Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security. 2023.
BagAmmo employs a genetic algorithm (GA) that approximates the target classifier with a surrogate model and uses the Add Edge operator to modify the graph's structure, optimizing the attack process.
We first summarize the limitations of BagAmmo in two aspects. We then describe our replication with two types of surrogate models in a white-box setting: the first is a GCN and the second is an MLP, which we name BagAmmo_G and BagAmmo_M, respectively. Their results are shown in the above Table.
(1) Inefficient Exploration of Perturbations: Random strategies based on a single type of perturbation lose much of their effectiveness when applied to detectors with extensive feature dimensions, such as MalScan with 21,986 dimensions. Additionally, when an APK contains only a few modifiable user nodes, the Add Edge operator is confined to a very small portion of the search space, further reducing the effectiveness of the exploration (see the sketch after this list).
(2) Coarse-Grained Evaluation of Perturbation Effects: Using a linear surrogate model to standardize outputs from various models, whether binary or continuous, often fails to capture the subtle, feature-level effects of critical perturbations, making the attack prone to local optima. The issue is more pronounced with robust tree-based models, whose step-like decision boundaries yield many individuals with identical scores that the genetic algorithm cannot differentiate during selection, hindering effective optimization.
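As a concrete illustration of limitation (1), the sketch below restricts the Add Edge operator to modifiable user-defined methods; the call graph is assumed to be a `networkx.DiGraph`, and all names are illustrative.

```python
import random
import networkx as nx

def add_edge_mutation(call_graph: nx.DiGraph, user_nodes: list) -> nx.DiGraph:
    """Insert one new call edge between user-defined methods.

    With only a few modifiable user nodes, the candidate set below is
    tiny relative to a feature space such as MalScan's 21,986 dimensions,
    which is why exploration stalls.
    """
    candidates = [(u, v) for u in user_nodes for v in user_nodes
                  if u != v and not call_graph.has_edge(u, v)]
    if not candidates:                           # search space exhausted
        return call_graph
    u, v = random.choice(candidates)
    call_graph.add_edge(u, v)                    # semantics-preserving caller -> callee edge
    return call_graph
```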
We replicated BagAmmo using a five-layer Graph Convolutional Network (GCN) that processes edge data along with node in-degrees and out-degrees as input features. The GCN was trained on the same training set (Section VII-A) to learn the ground truth, serving as a white-box surrogate model. In this white-box setting we also have access to the output scores of the target model, so we use the scores from both the surrogate model and the target model to select perturbations, thereby enhancing the attack's effectiveness.
The GCN architecture progressively widens the feature dimensions from 16 to 128 across its layers, with a ReLU activation and dropout after each layer to prevent overfitting. Training was conducted on a GPU using a DataLoader with shuffled batches and the Adam optimizer (learning rate 0.001) for 100 epochs, with CrossEntropyLoss as the training objective. Each epoch comprised forward propagation, loss computation, back-propagation, and parameter updates. Upon completion of training, the model achieved an accuracy of approximately 80%.
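A minimal PyTorch Geometric sketch of this surrogate and its training loop is given below; the intermediate layer widths between 16 and 128, the dropout rate, and the batch size are our assumptions where the description above does not pin them down.

```python
import torch
import torch.nn.functional as F
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv, global_mean_pool

class SurrogateGCN(torch.nn.Module):
    """Five GCN layers widening from 16 to 128 channels; node features
    are the in-degree and out-degree (2 dims)."""

    def __init__(self, in_dim=2, num_classes=2, p_drop=0.5):
        super().__init__()
        dims = [in_dim, 16, 32, 64, 128, 128]    # intermediate widths assumed
        self.convs = torch.nn.ModuleList(
            [GCNConv(dims[i], dims[i + 1]) for i in range(5)])
        self.dropout = torch.nn.Dropout(p_drop)
        self.classifier = torch.nn.Linear(128, num_classes)

    def forward(self, x, edge_index, batch):
        for conv in self.convs:                  # ReLU + dropout after each layer
            x = self.dropout(F.relu(conv(x, edge_index)))
        x = global_mean_pool(x, batch)           # graph-level embedding
        return self.classifier(x)                # logits for CrossEntropyLoss

# `train_graphs` stands in for the Section VII-A training set (a list of
# torch_geometric.data.Data objects with .x, .edge_index, and .y).
device = torch.device("cuda")
loader = DataLoader(train_graphs, batch_size=32, shuffle=True)
model = SurrogateGCN().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = torch.nn.CrossEntropyLoss()
for epoch in range(100):
    for data in loader:
        data = data.to(device)
        optimizer.zero_grad()
        out = model(data.x, data.edge_index, data.batch)
        loss = criterion(out, data.y)
        loss.backward()
        optimizer.step()
```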
To replicate the other white-box setting of BagAmmo, we use an MLP as the surrogate model, consistent with our experimental setup, and again use the scores from both the surrogate model and the target model to select perturbations.
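How the two scores can be combined during GA selection is sketched below; the weighting `w` and the helper names are illustrative assumptions rather than the original implementation.

```python
def combined_fitness(surrogate_score, target_score, w=0.5):
    """Fitness of one perturbed candidate: lower (weighted) malware
    probability is better, so negate the weighted sum. `w` is assumed."""
    return -(w * surrogate_score + (1.0 - w) * target_score)

def select_survivors(population, scores, k):
    """Keep the k most evasive individuals for the next GA generation.
    `scores` pairs each individual with its (surrogate, target) scores."""
    ranked = sorted(zip(population, scores),
                    key=lambda pair: combined_fitness(*pair[1]), reverse=True)
    return [individual for individual, _ in ranked[:k]]
```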