In this lab, we will:
Load and preprocess the AndroidMalware dataset
Implement classical SVM using scikit-learn
Implement QSVM in PennyLane
Compare SVM and QSVM in terms of accuracy
Step 1: Install and import libraries
First, download the dataset and upload it to Google Drive.
The path used to load the dataset has the form drive/MyDrive/<your folder 1 (optional)>/<your folder 2 (optional)>/<name of the dataset>.csv
Before training, we have to preprocess the dataset so that it can be used by our machine learning models.
Typically, we drop rows with missing values and standardize the features.
If the feature space is high-dimensional, PCA is used to condense the information into fewer components and make training easier.
In addition, if we want to train the quantum model, we should rescale the data to [-π, π]: angle embedding uses each feature as a rotation angle, and because rotations are periodic, values outside this range wrap back into it and can become indistinguishable from other values.
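The preprocessing steps above can be sketched as follows. This is a minimal sketch on synthetic data, not the lab's actual notebook code: the column names, the number of features, and the choice of 4 principal components are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Synthetic stand-in for the dataset (assumption): 8 numeric features, binary label.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 8)), columns=[f"f{i}" for i in range(8)])
df["label"] = rng.integers(0, 2, size=100)
df.iloc[3, 2] = np.nan  # simulate one missing value

# 1. Drop rows with missing values.
df = df.dropna()

X = df.drop(columns="label").to_numpy()
y = df["label"].to_numpy()

# 2. Standardize each feature to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)

# 3. Condense the features with PCA (4 components is an arbitrary choice here).
X_pca = PCA(n_components=4).fit_transform(X_std)

# 4. Rescale to [-pi, pi] so each value can serve as a rotation angle.
X_angles = MinMaxScaler(feature_range=(-np.pi, np.pi)).fit_transform(X_pca)

print(X_angles.shape, X_angles.min(), X_angles.max())
```

For brevity the scalers and PCA are fitted on the whole dataset here; when a train/test split is used, they would normally be fitted on the training part only and then applied to the test part.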
We can feed the preprocessed data directly into our classical model; SVM classifiers are already built into the scikit-learn library.
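As an illustration of the classical baseline, here is a minimal sketch using scikit-learn's SVC. The data comes from make_classification as a placeholder for the real dataset, and the RBF kernel and the 75/25 split are assumptions, not necessarily the settings used in the lab.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder data (assumption): 200 samples with 4 features and 2 classes.
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=0, random_state=0
)

# Hold out 25% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a classical SVM with an RBF kernel and score it on the held-out data.
clf = SVC(kernel="rbf")
clf.fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))

print(f"Classical SVM test accuracy: {accuracy:.3f}")
```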
The first step on the quantum side is to choose the number of qubits. Usually, it equals the number of features.
Then we load the data into our quantum model.
First, we apply a Hadamard gate to each qubit to map the initial state |0> into a superposition for the later data embedding.
Then we use rotation gates to inject the data, using each feature as a rotation angle.
We can also use CNOT gates to entangle the qubits and so strengthen the connections between features.
The core of our quantum kernel is a sequence of unitary (matrix) operations.
We apply the feature map to the first input, and then apply the adjoint of the feature map for the second input.
Measuring the output then gives the similarity between the two inputs: in the usual construction, the probability of finding all qubits in |0> is the squared overlap of the two feature-mapped states.
This is the kernel trick in the quantum kernel.
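We then fit the weights and bias on the training kernel matrix, and use these parameters with the test-train kernel matrix (the similarity of each test sample to each training sample) to compute a decision value for every test sample.
The sign of the decision value indicates the predicted class.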
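To make the prediction step concrete, here is a minimal sketch using scikit-learn's SVC with kernel="precomputed". A classical RBF function stands in for the quantum kernel so that the example runs quickly; the function name kernel_matrix and the synthetic data are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def kernel_matrix(A, B, gamma=0.5):
    # Stand-in kernel (assumption): classical RBF similarity between the rows
    # of A and the rows of B. A quantum kernel would be plugged in here.
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

X, y = make_classification(
    n_samples=120, n_features=4, n_informative=3, n_redundant=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# Train-train matrix for fitting, test-train matrix for prediction.
K_train = kernel_matrix(X_train, X_train)  # shape (n_train, n_train)
K_test = kernel_matrix(X_test, X_train)    # shape (n_test, n_train)

clf = SVC(kernel="precomputed")
clf.fit(K_train, y_train)

# Decision value = weighted sum of similarities to the support vectors + bias.
weights = clf.dual_coef_[0]   # one weight per support vector
bias = clf.intercept_[0]
decision = K_test[:, clf.support_] @ weights + bias

# The sign of the decision value selects the class.
y_pred = np.where(decision > 0, clf.classes_[1], clf.classes_[0])
print("Test accuracy:", (y_pred == y_test).mean())
```

Only the columns of the test-train matrix that belong to support vectors contribute to the decision value, which is why it is indexed with clf.support_.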