Describe the purpose of PCA and how it reduces dimensionality while preserving variance.
Define covariance matrices, eigenvalues, eigenvectors, and their roles in PCA.
Explain how QPCA works using quantum states, density matrices, and the Variational Quantum Eigensolver.
Apply PCA and QPCA concepts to a real engineering dataset.
Interpret variance, reconstruction error, and runtime differences between classical and quantum approaches.
1. What is PCA?
Principal Component Analysis (PCA) is a method that identifies the directions in a dataset where the data varies the most. These directions are called principal components. PCA lets us replace many correlated variables with a small number of new, uncorrelated variables that retain most of the original variation.
When you have a dataset:
X ∈ ℝⁿ×ᵈ
• n = number of samples
• d = number of features
The goal of PCA is to find a set of new axes that capture as much variation as possible.
2. Covariance Matrix
The covariance matrix measures how each pair of features varies together. PCA begins by centering the data (subtracting each feature's mean from its column) and then computing:
C = (1 / (n − 1)) · XᵀX
Where:
• Covariance values > 0 mean features increase together
• Covariance values < 0 mean one increases while the other decreases
• Values near 0 mean little relationship
This matrix C summarizes the variability structure of the dataset.
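As a concrete illustration, a minimal NumPy sketch of this step might look like the following (the small matrix X here is placeholder data, not the concrete dataset):

```python
import numpy as np

# Toy data: 5 samples, 3 features (placeholder values for illustration only)
X = np.array([[2.0, 1.0, 0.5],
              [3.0, 2.5, 1.0],
              [4.0, 2.0, 1.5],
              [5.0, 4.0, 2.0],
              [6.0, 3.5, 2.5]])

n = X.shape[0]

# Center each feature by subtracting its column mean (PCA assumes centered data)
Xc = X - X.mean(axis=0)

# Covariance matrix: C = (1 / (n - 1)) * Xc^T Xc
C = (Xc.T @ Xc) / (n - 1)

# np.cov gives the same result (rowvar=False means columns are the variables)
assert np.allclose(C, np.cov(X, rowvar=False))
print(C)
```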
3. Eigenvalue and Eigenvector System
The next step is solving the eigenvalue equation:
C v = λ v
Where:
• v = eigenvector = principal direction
• λ = eigenvalue = amount of variance along v
Eigenvectors are sorted from largest λ to smallest λ.
The first eigenvector captures the largest variance, the second captures the next largest, and so on.
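A sketch of this eigen-decomposition step in NumPy, using a small symmetric matrix as a stand-in for the covariance matrix C:

```python
import numpy as np

# Stand-in for the covariance matrix C (any symmetric matrix works here)
C = np.array([[2.50, 1.80, 1.20],
              [1.80, 1.40, 0.90],
              [1.20, 0.90, 0.65]])

# eigh is the appropriate solver for symmetric matrices: real eigenvalues, orthonormal eigenvectors
eigvals, eigvecs = np.linalg.eigh(C)

# eigh returns eigenvalues in ascending order, so reverse to sort from largest to smallest
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

print("variance along each principal direction:", eigvals)
print("first principal direction:", eigvecs[:, 0])
```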
4. Dimensionality Reduction through Projection
Once we select the top k eigenvectors, we project the data into this reduced space:
X′ = X W
Where:
• W = [v₁ v₂ … vₖ] is the matrix of the top k principal components
• X′ is the lower-dimensional representation
For this project, we use k = 2, so data is reduced to two dimensions.
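The projection step might be sketched as follows; the random matrix stands in for centered data, and in the actual module the concrete features would take its place:

```python
import numpy as np

# Stand-in centered data (100 samples, 8 features) and its eigen-decomposition
Xc = np.random.default_rng(0).normal(size=(100, 8))
C = (Xc.T @ Xc) / (Xc.shape[0] - 1)
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]

k = 2                              # keep the top two principal components
W = eigvecs[:, order[:k]]          # d x k matrix of leading eigenvectors

X_reduced = Xc @ W                 # n x k projection: X' = X W
print(X_reduced.shape)             # (100, 2)
```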
5. What is QPCA?
Quantum Principal Component Analysis (QPCA) extends PCA using quantum operations.
Instead of working directly with the covariance matrix, QPCA uses its quantum analogue, called a density matrix:
ρ = (1 / N) ∑ᵢ₌₁ᴺ |xᵢ⟩⟨xᵢ|
Here, each data point xᵢ is encoded as a quantum state |xᵢ⟩, and ρ is mathematically analogous to the classical covariance matrix. QPCA then seeks the eigenvalues and eigenvectors of ρ.
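Because ρ is built from unit-norm data vectors, its construction can be sketched classically in NumPy (with placeholder data) as follows:

```python
import numpy as np

# Stand-in data: 100 samples, 8 features (placeholder, not the project dataset)
X = np.random.default_rng(1).normal(size=(100, 8))

# Amplitude-encode each sample: |x_i> corresponds to the unit-norm version of row x_i
norms = np.linalg.norm(X, axis=1, keepdims=True)
states = X / norms

# Density matrix: rho = (1/N) * sum_i |x_i><x_i|, an 8 x 8 Hermitian, trace-1 matrix
rho = (states.T @ states) / states.shape[0]

print(np.trace(rho))   # ~1.0, as required for a density matrix
```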
6. Variational Quantum Eigensolver (VQE)
QPCA uses a hybrid quantum-classical algorithm called VQE:
• Starts with a parameterized quantum circuit.
• Prepares a trial quantum state |ψ(θ)⟩.
• Measures the expectation value of a Hermitian operator that embeds C.
• Uses a classical optimizer to update θ, driving this expectation value toward an extremum (a maximum when searching for the direction of largest variance).
• When the measured value stabilizes, |ψ(θ)⟩ approximates an eigenvector of C and the expectation value approximates the corresponding eigenvalue λ.
QPCA repeats this process twice to retrieve the first two principal components, with the second run constrained to directions orthogonal to the first so that it does not simply return the same component again.
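To make the variational loop concrete, here is a purely classical stand-in for the VQE idea: a parameterized "trial state" whose expectation value ⟨ψ(θ)|C|ψ(θ)⟩ is driven to a maximum by a classical optimizer. This is a sketch of the principle only, not a quantum implementation, and the matrix C is placeholder data:

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in Hermitian matrix playing the role of C / rho (placeholder values)
C = np.array([[2.50, 1.80, 1.20],
              [1.80, 1.40, 0.90],
              [1.20, 0.90, 0.65]])

def state(theta):
    """Map unconstrained parameters theta to a normalized 'trial state' vector."""
    return theta / np.linalg.norm(theta)

def neg_expectation(theta):
    """Negative expectation value <psi(theta)| C |psi(theta)>, minimized to maximize variance."""
    v = state(theta)
    return -(v @ C @ v)

# The classical optimizer updates theta until the expectation value stabilizes
result = minimize(neg_expectation, x0=np.ones(3), method="Nelder-Mead")

v_opt = state(result.x)
print("variational estimate of largest eigenvalue:", -result.fun)
print("variational estimate of first principal direction:", v_opt)
```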
7. Dataset Used in This Module
We use the Yeh Concrete Compressive Strength Dataset, containing 1030 samples of concrete mixtures with features like:
Cement, Slag, Fly ash, Water, Superplasticizer, Coarse aggregate, Fine aggregate, Age
PCA and QPCA will reduce these numerical features to a 2D representation.
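A sketch of the classical PCA pipeline on such a feature matrix, assuming the eight features have already been loaded into an array (replaced here by random placeholder values) and using scikit-learn:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Placeholder for the 1030 x 8 feature matrix; in the module this would come from
# the Yeh concrete dataset (Cement, Slag, Fly ash, Water, Superplasticizer,
# Coarse aggregate, Fine aggregate, Age), loaded however the course provides it.
X = np.random.default_rng(2).normal(size=(1030, 8))

# Standardize so that features measured in different units contribute comparably
X_std = StandardScaler().fit_transform(X)

# Classical PCA down to k = 2 dimensions
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_std)

print(X_2d.shape)                      # (1030, 2)
print(pca.explained_variance_ratio_)   # fraction of variance captured by each component
```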