1. Playing Around with Existing Hyperparameters:
We began by experimenting with the default hyperparameters in the IonQ model to understand their impact on performance: the learning rate, batch size, and number of epochs. The goal was to tune these values to maximize classification accuracy without overfitting or underfitting the model.
We also increased the number of convolution and pooling layers in the ansatz and tested the effect of varying the entanglement depth in the quantum circuits, expecting deeper entanglement to enhance the model's ability to capture complex data patterns.
Additionally, we explored the BrickworkLayoutAnsatz, which creates a structured pattern of entanglement between qubits. This layout was tested with different block sizes and layer depths to find a balance between complexity and performance. The BrickworkLayoutAnsatz was particularly interesting because it allowed for structured, moderate entanglement, potentially offering a different balance compared to QCNN and QAOA.
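To make this concrete, here is a minimal sketch of the kind of sweep we ran over the learning rate and batch size. It assumes the classifier object and the get_train_test_set / train_module interface from the training cell in Section 4; the specific value grids are illustrative, and in practice each run would start from a freshly initialized model.
# Illustrative hyperparameter sweep; assumes the classifier object and the
# get_train_test_set / train_module interface shown in the Section 4 training cell.
train_set, test_set = classifier.get_train_test_set(train_size=300, test_size=100)

for lr in (0.1, 0.01, 0.001):          # candidate learning rates (illustrative)
    for batch_size in (25, 55, 100):   # candidate batch sizes (illustrative)
        config = {
            "epochs": 10,
            "lr": lr,
            "batch_size": batch_size,
            "betas": (0.9, 0.99),
            "weight_decay": 1e-3,
            "clip_grad": True,
            "log_interval": 6,
        }
        print(f"Training with lr={lr}, batch_size={batch_size}")
        # In practice each run should start from a freshly initialized classifier.
        classifier.train_module(train_set, test_set, config)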
2. Researching the Differences Between the QCNN, QAOA, Brickwork, and AngleEncoder Ansatz Layouts:
AngleEncoder:
What it does: The AngleEncoder encodes classical data into a quantum state by applying rotation gates (such as Ry) to the qubits, with the rotation angles determined by the input data. The depth of the entanglement between qubits can be adjusted to improve the model's ability to capture complex correlations.
Strengths: The AngleEncoder provides a flexible and efficient way to embed classical data into quantum circuits. Its tunable entanglement depth allows it to capture intricate relationships within the data.
Where it shines: It works well in general quantum circuits where efficient encoding of data is critical. It helps capture important patterns in the data early on in the pipeline, which can improve overall accuracy in combination with other components.
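As an illustration of the encoding idea only (a Qiskit sketch, not the IonQVision AngleEncoder itself), the snippet below maps each classical feature to the rotation angle of an Ry gate on its own qubit and then adds a simple chain of CNOTs for entanglement; the feature scaling and the CNOT chain are assumptions made for the example.
import numpy as np
from qiskit import QuantumCircuit

def angle_encode(features, entangle=True):
    """Encode one classical feature per qubit as the angle of an Ry rotation."""
    n = len(features)
    qc = QuantumCircuit(n)
    for i, x in enumerate(features):
        qc.ry(float(x), i)         # rotation angle set by the feature value
    if entangle:
        for i in range(n - 1):     # simple CNOT chain to correlate the qubits
            qc.cx(i, i + 1)
    return qc

# Example: encode four pixel values already rescaled to [0, pi]
print(angle_encode(np.pi * np.array([0.1, 0.5, 0.3, 0.9])))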
QCNN (Quantum Convolutional Neural Network):
What it does: Inspired by classical CNNs, the QCNNAnsatz uses a hierarchical structure of convolutional (entangling) and pooling layers applied to quantum states. This lets it process quantum data hierarchically, down-sampling the state and reducing dimensionality while extracting the essential information.
Strengths: The hierarchical structure makes QCNN particularly strong on image-like data (such as MNIST), where patterns and features at different scales need to be captured. Its ability to down-sample quantum data while preserving essential information gives it an edge in structured data tasks.
Where it shines: QCNNAnsatz excels at tasks involving structured or hierarchical data, such as image classification, where the model needs to capture multi-scale patterns and relationships.
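The hierarchy is easiest to see in a toy circuit. The sketch below (Qiskit, purely illustrative rather than the actual QCNNAnsatz) alternates parameterized "convolution" operations on neighbouring qubits with a "pooling" step that hands only half of the qubits on to the next layer.
from qiskit import QuantumCircuit
from qiskit.circuit import ParameterVector

def qcnn_sketch(n_qubits=8):
    """Toy QCNN-style circuit: convolution blocks followed by pooling layers."""
    qc = QuantumCircuit(n_qubits)
    active = list(range(n_qubits))
    layer = 0
    while len(active) > 1:
        params = ParameterVector(f"theta_{layer}", 2 * len(active))
        # "Convolution": parameterized rotations plus entanglers on neighbouring pairs
        for idx, q in enumerate(active):
            qc.ry(params[2 * idx], q)
            qc.rz(params[2 * idx + 1], q)
        for a, b in zip(active[::2], active[1::2]):
            qc.cz(a, b)
        # "Pooling": entangle each pair, then carry only one qubit of the pair forward
        for a, b in zip(active[::2], active[1::2]):
            qc.cx(a, b)
        active = active[1::2]   # half the qubits continue to the next layer
        layer += 1
    return qc

print(qcnn_sketch(8))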
QAOA (Quantum Approximate Optimization Algorithm):
What it does: QAOA is primarily designed for solving combinatorial optimization problems. It uses alternating layers of cost and mixing Hamiltonians to minimize a given objective function, producing an approximate solution.
Strengths: QAOA is effective at finding approximate solutions to optimization problems and can adapt to different scenarios through careful tuning of parameters such as the circuit depth and the angles.
Where it shines: It is most effective on optimization tasks rather than data classification. It can be adapted for classification, but it requires careful parameter tuning and is less naturally suited to capturing data patterns than QCNN or the BrickworkLayout.
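For intuition, here is a minimal Qiskit sketch of the alternating structure for a toy MaxCut-style cost on a line of qubits: each layer applies a cost unitary built from ZZ rotations (parameter gamma) followed by an X-rotation mixer (parameter beta). This illustrates the general QAOA pattern, not the specific QAOAAnsatz used in our experiments.
from qiskit import QuantumCircuit
from qiskit.circuit import Parameter

def qaoa_sketch(n_qubits=4, p=2):
    """Toy QAOA circuit: p alternating cost (ZZ) and mixer (X) layers."""
    qc = QuantumCircuit(n_qubits)
    qc.h(range(n_qubits))                      # start in the uniform superposition
    for layer in range(p):
        gamma = Parameter(f"gamma_{layer}")
        beta = Parameter(f"beta_{layer}")
        for i in range(n_qubits - 1):          # cost unitary for edges (i, i+1)
            qc.rzz(2 * gamma, i, i + 1)
        for i in range(n_qubits):              # mixing unitary
            qc.rx(2 * beta, i)
    return qc

print(qaoa_sketch())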
BrickworkLayoutAnsatz:
What it does: The BrickworkLayoutAnsatz entangles qubits in a structured, layered brickwork pattern, interleaving entangling operations with rotation gates. It allows a flexible number of layers and block sizes (which we increased to test its capacity for feature extraction), and its structure makes it straightforward to scale with the number of qubits.
Strengths: BrickworkLayoutAnsatz offers a balance between circuit complexity and performance. Its structured, moderate entanglement captures important relationships between qubits, balancing global and local quantum interactions, without becoming overly complex or inefficient.
Where it shines: It is particularly effective for tasks that require a good balance between complexity and generalization, making it well suited to classification tasks where too much complexity can lead to overfitting but sufficient entanglement is needed to detect important data features.
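The brickwork pattern itself is easy to sketch: two-qubit entangling blocks tile the register, and their starting offset alternates from layer to layer like bricks in a wall. The Qiskit snippet below is an illustration of the layout only, not the IonQVision BrickworkLayoutAnsatz.
from qiskit import QuantumCircuit
from qiskit.circuit import ParameterVector

def brickwork_sketch(n_qubits=6, n_layers=3):
    """Toy brickwork circuit: two-qubit blocks whose offset alternates per layer."""
    qc = QuantumCircuit(n_qubits)
    for layer in range(n_layers):
        params = ParameterVector(f"theta_{layer}", n_qubits)
        for q in range(n_qubits):
            qc.ry(params[q], q)                # single-qubit rotations
        offset = layer % 2                     # shift the "bricks" every other layer
        for a in range(offset, n_qubits - 1, 2):
            qc.cz(a, a + 1)                    # two-qubit entangling block
    return qc

print(brickwork_sketch())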
AngleEncoder: Great for efficiently encoding classical data into quantum circuits with tunable entanglement depth.
BrickworkLayoutAnsatz: Balances complexity and performance, ideal for general classification tasks.
QCNNAnsatz: Excels in hierarchical data (like images) by mimicking the structure of classical CNNs to capture multi-scale patterns.
QAOAAnsatz: Best suited for optimization problems, struggles with image-based classification.
3. Determining the Better Ansatz:
We tested the QCNNAnsatz, QAOAAnsatz, and BrickworkLayoutAnsatz under similar conditions to determine which one performs best for the BinaryMNISTClassifier task (a sketch of the comparison loop follows the hypotheses below).
QCNN was hypothesized to perform better on image data due to its hierarchical structure, which mimics convolutional layers used in classical image classification models.
QAOA, being more focused on optimization, was expected to struggle unless the problem was framed in an optimization context.
BrickworkLayoutAnsatz was expected to offer a balance between complexity and performance, with structured entanglement potentially capturing the patterns in MNIST data effectively.
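The comparison loop itself can be kept very small. The helper below is a hedged sketch: it assumes one already-constructed classifier instance per ansatz, each exposing the same get_train_test_set / train_module / plot_training_progress interface used in the Section 4 training cell; the instance names in the usage comment are hypothetical.
def compare_ansatz_classifiers(classifiers, config, train_size=300, test_size=100):
    """Train each classifier under identical conditions.

    classifiers maps a label (e.g. "QCNN") to an already-constructed classifier
    exposing the interface used in the Section 4 training cell.
    """
    for name, clf in classifiers.items():
        print(f"=== Training with {name} ===")
        train_set, test_set = clf.get_train_test_set(train_size=train_size, test_size=test_size)
        clf.train_module(train_set, test_set, config)
        clf.plot_training_progress()

# Usage (hypothetical instance names):
# compare_ansatz_classifiers(
#     {"QCNN": qcnn_classifier, "QAOA": qaoa_classifier, "Brickwork": brickwork_classifier},
#     config,
# )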
4. Testing and Training:
We trained the QCNN, QAOA, and BrickworkLayout models using a 300-sample training set and a 100-sample test set. The training process was automated using the IonQVision package. All models were trained under the same hyperparameters:
After training, the performance of each model was evaluated based on the accuracy achieved on the unseen test set.
%%time
# Get a (pre-processed) training and test set
train_set, test_set = classifier.get_train_test_set(train_size=300, test_size=100)
# Configure model training hyperparameters
config = {
    "epochs": 10,            # number of training epochs
    "lr": 0.01,              # learning rate
    "batch_size": 55,        # mini-batch size
    "betas": (0.9, 0.99),    # Adam optimizer betas
    "weight_decay": 1e-3,    # regularization term to prevent overfitting
    "clip_grad": True,       # gradient clipping to avoid exploding gradients
    "log_interval": 6,       # how often to log progress updates
}
classifier.train_module(train_set, test_set, config)
classifier.plot_training_progress()
5. Conclusion:
QCNNAnsatz outperformed both QAOAAnsatz and BrickworkLayoutAnsatz in the classification task. The QCNN's structure allowed it to capture hierarchical features in the MNIST data more effectively, leading to better test accuracy.
BrickworkLayoutAnsatz showed promising results, offering a good balance between complexity and generalization. Its structured entanglement helped capture important patterns in the data, and with further tuning, it could be a strong contender.
QAOA, although a powerful tool for optimization problems, was less suited for this image-based classification task. The structure of QAOA was less effective at identifying patterns in the data that QCNN and BrickworkLayout could more easily detect due to their entanglement patterns.
Ultimately, QCNNAnsatz proved to be the most effective for BinaryMNISTClassifier tasks, particularly due to its ability to handle hierarchical data more naturally. The BrickworkLayoutAnsatz provided a good alternative by offering moderate complexity and structured entanglement, and it performed competitively when balanced with the right number of layers and block sizes. Adjusting hyperparameters like the number of layers and entanglement depth further enhanced performance, but maintaining a balance between complexity and generalization was key.
During our testing and experimentation, we initially hypothesized that combining QCNNAnsatz, BrickworkLayoutAnsatz, and QAOAAnsatz in a hybrid model might lead to improved accuracy by leveraging the strengths of each ansatz. The idea was to capture more diverse quantum features, hoping for a more robust model. However, in practice, this approach did not yield the expected results.
The hybrid model, combining all three ansatzes, performed poorly in comparison to using QCNNAnsatz or BrickworkLayoutAnsatz alone. While QCNNAnsatz consistently outperformed the others, capturing hierarchical features in the MNIST dataset and providing better accuracy during testing, the hybrid combination introduced too much complexity. This complexity likely interfered with the model's ability to generalize and capture meaningful patterns in the data, causing the test accuracy to plateau or even decline.
The BrickworkLayoutAnsatz, on its own, showed promising results, striking a good balance between complexity and generalization. It could still be optimized further, but it was clear that the structured entanglement in this ansatz was beneficial for capturing patterns in the data. On the other hand, QAOAAnsatz, while powerful for optimization tasks, struggled to adapt effectively to the image classification problem, lagging behind both QCNNAnsatz and BrickworkLayoutAnsatz.
Ultimately, this experiment highlighted that combining multiple quantum ansatzes can sometimes lead to diminishing returns, especially when their architectures and strengths don't complement each other for the given task. It reaffirmed that QCNNAnsatz is the most effective for classification tasks like MNIST, and further optimization should focus on tuning hyperparameters within a simpler model rather than adding more complexity.