Link to dataset: https://www.kaggle.com/datasets/hasibullahaman/traffic-prediction-dataset
Link to code:
1. Environment Setup: Install the PennyLane and classical ML packages needed to run the quantum and classical models.
2. Library Imports: Import all necessary modules: scikit-learn for classical ML, PennyLane for quantum computing, and the standard data science libraries.
3. Data Loading: Mount Google Drive and load the traffic dataset into a pandas DataFrame.
4. Data Preprocessing: Parse the 'Time' column into a continuous decimal hour between 0 and 24, map the 'Day of the Week' column to integer values 0-6, and map the target labels to integer values 0-3, where 0 is "low" traffic and 3 is "heavy" traffic.
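The preprocessing step might be sketched as below. The column names (`Time`, `Day of the week`, `Traffic Situation`) and the label strings are assumptions based on the dataset description, not confirmed from the notebook; the toy DataFrame stands in for the loaded CSV.

```python
import pandas as pd

# Toy stand-in for the loaded traffic DataFrame; column names and label
# strings are assumed from the dataset description.
df = pd.DataFrame({
    "Time": ["12:30:00 AM", "3:45:00 PM"],
    "Day of the week": ["Monday", "Sunday"],
    "Traffic Situation": ["low", "heavy"],
})

# Parse 'Time' into a continuous decimal hour in [0, 24).
t = pd.to_datetime(df["Time"], format="%I:%M:%S %p")
df["hour"] = t.dt.hour + t.dt.minute / 60

# Map day names to 0-6 and target labels to 0-3 ("low" .. "heavy").
day_map = {d: i for i, d in enumerate(
    ["Monday", "Tuesday", "Wednesday", "Thursday",
     "Friday", "Saturday", "Sunday"])}
label_map = {"low": 0, "normal": 1, "high": 2, "heavy": 3}
df["day_num"] = df["Day of the week"].map(day_map)
df["target"] = df["Traffic Situation"].map(label_map)
```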
5. Split and Scale: The eight input features are split into training (80%) and test (20%) sets using stratified sampling to preserve the class distribution. Features are then standardized with StandardScaler, which centers each feature to zero mean and unit variance, since both the MLP and the quantum models are sensitive to input scale.
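A minimal sketch of the split-and-scale step, using synthetic arrays in place of the real features. Note the scaler is fit on the training set only, so no test-set statistics leak into training.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))      # stand-in for the 8 input features
y = rng.integers(0, 4, size=200)   # stand-in for the 4 traffic classes

# Stratified 80/20 split preserves the class proportions in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Fit on training data only, then apply the same transform to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)
```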
6. Feature Selection: Quantum circuits are limited by the number of available qubits, so with n_qubits = 4 only 4 features can be fed into the quantum and hybrid models. A random forest serves as a feature-importance scorer to identify the 4 most informative of the 8 features.
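The random-forest scoring could look like this sketch. The synthetic data deliberately makes two features informative so the ranking has something to find; in the notebook the forest is fit on the real training split.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(160, 8))
y_train = rng.integers(0, 4, size=160)
# Make features 1 and 5 class-dependent so the forest can rank them highly.
X_train[:, 1] += y_train
X_train[:, 5] -= y_train

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)

# Indices of the 4 most informative features, by impurity-based importance.
top4 = np.argsort(rf.feature_importances_)[-4:]
X_train_q = X_train[:, top4]   # reduced feature set for the quantum models
```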
7. MLP Training and Evaluation: An MLP with two hidden layers of 64 and 32 neurons is trained on the full feature set. Training halts as soon as the loss stops improving, and we record the training time and the accuracy it achieves.
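A sketch of the classical baseline, assuming scikit-learn's MLPClassifier with its built-in early stopping (which monitors validation score over `n_iter_no_change` epochs); synthetic data stands in for the traffic features.

```python
import time
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=8, n_informative=6,
                           n_classes=4, random_state=42)

# Two hidden layers (64, 32); early_stopping holds out a validation split
# and halts when its score stops improving.
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), early_stopping=True,
                    n_iter_no_change=10, max_iter=500, random_state=42)

start = time.time()
mlp.fit(X, y)
train_time = time.time() - start   # recorded alongside accuracy
acc = mlp.score(X, y)
```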
8. VQC Setup: We define the quantum circuit with 4 qubits, one per input feature. Classical features are encoded into rotation angles with AngleEmbedding, then StronglyEntanglingLayers applies parametrized entangling gates across all qubits. Each of the 4 qubits returns a Pauli-Z expectation value, and the softmax function converts these values into class probabilities.
9. VQC Training: The VQC is trained for 60 epochs; each epoch selects a random batch of 32 samples and evaluates the quantum circuit over it.
10. VQC Evaluation: The trained circuit is run over the full test set. Each sample produces 4 expectation values, which are passed through softmax, and the class with the highest probability is selected as the prediction.
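The expectation-to-prediction step is plain NumPy once the circuit outputs are collected; the expectation values below are hypothetical stand-ins for real circuit outputs.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax along the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical Pauli-Z expectation values for 3 test samples (4 qubits each).
expvals = np.array([[ 0.9, -0.2,  0.1, -0.5],
                    [-0.3,  0.8,  0.0,  0.2],
                    [ 0.1,  0.0, -0.1,  0.7]])

probs = softmax(expvals)
preds = probs.argmax(axis=1)   # class with highest probability per sample
```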
11. Hybrid Model Definition: The HybridNet class defines the full end-to-end hybrid architecture. A classical front-end compresses the 8 input features down to 4 values through two linear layers; these 4 values pass through a PennyLane quantum circuit and then a final linear layer that maps the 4 quantum outputs to class logits.
12. Data Preparation: The scaled NumPy arrays are converted into PyTorch tensors with the appropriate data types.
13. Hybrid Training: The full HybridNet is trained for 30 epochs using PyTorch's Adam optimizer and CrossEntropyLoss. At each step, gradients flow backward through the classical output head, through the quantum layer via TorchLayer's automatic differentiation, and all the way back through the classical front-end, jointly optimizing all classical and quantum parameters in a single pass.
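The training loop could be sketched as below. A small purely classical network stands in for the HybridNet instance so the sketch stays lightweight; the same loop applies unchanged to the hybrid model, since Adam simply optimizes whatever parameters the module exposes, quantum or classical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for HybridNet; in the notebook this is the hybrid model and the
# loop jointly updates its classical and quantum weights.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

X = torch.randn(64, 8)                # stand-in training tensors
y = torch.randint(0, 4, (64,))

loss_fn = nn.CrossEntropyLoss()
initial_loss = loss_fn(model(X), y).item()

opt = torch.optim.Adam(model.parameters(), lr=0.01)
for epoch in range(30):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()   # gradients flow through every trainable parameter
    opt.step()

final_loss = loss_fn(model(X), y).item()
```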
14. Hybrid Evaluation: The model is then evaluated on the test set, and the class with the highest logit is selected as the prediction.
15. Accuracy Comparison: Finally, we summarize the accuracy of the three models in a matplotlib bar chart.
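The comparison chart might be produced as follows. The accuracy values here are placeholders, not results from the notebook; in practice they come from the three evaluation steps above.

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend for scripted use
import matplotlib.pyplot as plt

# Placeholder accuracies; substitute the measured values from the notebook.
accuracies = {"MLP": 0.92, "VQC": 0.78, "Hybrid": 0.88}

fig, ax = plt.subplots()
ax.bar(accuracies.keys(), accuracies.values())
ax.set_ylabel("Test accuracy")
ax.set_ylim(0, 1)
ax.set_title("Classical vs. quantum vs. hybrid model accuracy")
fig.savefig("comparison.png")
```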