My main research interests

Quantum Machine Learning

Classical and Quantum Computational Learning Theory

Learning from Quantum Experiments

My preprints and publications

You can find all my publications on the arXiv. Here is an overview with some additional material (posters and videos) and the abstracts:

Online learning of quantum channels (Asad Raza, Matthias C. Caro, Jens Eisert, and Sumeet Khatri; arXiv preprint 2024)

Among recent insights into learning quantum states, online learning and shadow tomography procedures are notable for their ability to accurately predict expectation values even of adaptively chosen observables. In contrast to the state case, quantum process learning tasks with a similarly adaptive nature have received little attention. In this work, we investigate online learning tasks for quantum processes. Whereas online learning is infeasible for general quantum channels, we show that channels of bounded gate complexity as well as Pauli channels can be online learned in the regret and mistake-bounded models of online learning. In fact, we can online learn probabilistic mixtures of any exponentially large set of known channels. We also provide a provably sample-efficient shadow tomography procedure for Pauli channels. Our results extend beyond quantum channels to non-Markovian multi-time processes, with favorable regret and mistake bounds, as well as a shadow tomography procedure. We complement our online learning upper bounds with mistake as well as computational lower bounds. On the technical side, we make use of the multiplicative weights update algorithm, classical adaptive data analysis, and Bell sampling, as well as tools from the theory of quantum combs for multi-time quantum processes. Our work initiates a study of online learning for classes of quantum channels and, more generally, non-Markovian quantum processes. Given the importance of online learning for state shadow tomography, this may serve as a step towards quantum channel variants of adaptive shadow tomography. 
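The abstract above mentions the multiplicative weights update algorithm as a technical ingredient. As background, here is a minimal classical sketch of that primitive in a generic prediction-with-experts setting; this is illustrative only, not the paper's channel-learning procedure, and the function name and loss setup are my own.

```python
import math

def multiplicative_weights(losses, eta=0.5):
    """Multiplicative weights update over a fixed set of experts.

    losses: list of rounds, each a list of per-expert losses in [0, 1].
    Returns the algorithm's cumulative expected loss.
    """
    n = len(losses[0])
    weights = [1.0] * n
    total = 0.0
    for round_losses in losses:
        z = sum(weights)
        probs = [w / z for w in weights]
        # Expected loss this round under the current expert distribution.
        total += sum(p * l for p, l in zip(probs, round_losses))
        # Exponentially down-weight experts that incurred loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
    return total

# Expert 0 is always right; the algorithm's cumulative loss stays bounded
# (roughly logarithmic in the number of experts) instead of growing with
# the number of rounds.
rounds = [[0.0, 1.0, 1.0] for _ in range(50)]
print(round(multiplicative_weights(rounds), 2))
```

The regret guarantee of this scheme is what underlies the regret bounds in the online learning models discussed above.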

Hamiltonian Property Testing (Andreas Bluhm, Matthias C. Caro, and Aadil Oufkir; arXiv preprint 2024)

Locality is a fundamental feature of many physical time evolutions. Assumptions on locality and related structural properties also underlie recently proposed procedures for learning an unknown Hamiltonian from access to the induced time evolution. However, no protocols to rigorously test whether an unknown Hamiltonian is local were known. We investigate Hamiltonian locality testing as a property testing problem, where the task is to determine whether an unknown n-qubit Hamiltonian H is k-local or ε-far from all k-local Hamiltonians, given access to the time evolution along H. First, we emphasize the importance of the chosen distance measure: With respect to the operator norm, a worst-case distance measure, incoherent quantum locality testers require Ω(2^n) many time evolution queries and an expected total evolution time of Ω(2^n/ε), and even coherent testers need Ω(2^{n/2}) many queries and Ω(2^{n/2}/ε) total evolution time. In contrast, when distances are measured according to the normalized Frobenius norm, corresponding to an average-case distance, we give a sample-, time-, and computationally efficient incoherent Hamiltonian locality testing algorithm based on randomized measurements. In fact, our procedure can be used to simultaneously test a wide class of Hamiltonian properties beyond locality. Finally, we prove that learning a general Hamiltonian remains exponentially hard with this average-case distance, thereby establishing an exponential separation between Hamiltonian testing and learning. Our work initiates the study of property testing for quantum Hamiltonians, demonstrating that a broad class of Hamiltonian properties is efficiently testable even with limited quantum capabilities, and positioning Hamiltonian testing as an independent area of research alongside Hamiltonian learning. 

Information-theoretic generalization bounds for learning from quantum data (Matthias C. Caro, Tom Gur, Cambyse Rouzé, Daniel Stilck França, and Sathyawageeswar Subramanian; arXiv preprint 2023)

Learning tasks play an increasingly prominent role in quantum information and computation. They range from fundamental problems such as state discrimination and metrology, through the framework of quantum probably approximately correct (PAC) learning, to the recently proposed shadow variants of state tomography. However, the many directions of quantum learning theory have so far evolved separately. We propose a general mathematical formalism for describing quantum learning by training on classical-quantum data and then testing how well the learned hypothesis generalizes to new data. In this framework, we prove bounds on the expected generalization error of a quantum learner in terms of classical and quantum information-theoretic quantities measuring how strongly the learner's hypothesis depends on the specific data seen during training. To achieve this, we use tools from quantum optimal transport and quantum concentration inequalities to establish non-commutative versions of decoupling lemmas that underlie recent information-theoretic generalization bounds for classical machine learning. Our framework encompasses and gives intuitively accessible generalization bounds for a variety of quantum learning scenarios such as quantum state discrimination, PAC learning quantum states, quantum parameter estimation, and quantumly PAC learning classical functions. Thereby, our work lays a foundation for a unifying quantum information-theoretic perspective on quantum learning. 

Learning quantum states and unitaries of bounded gate complexity (Haimeng Zhao, Laura Lewis, Ishaan Kannan, Yihui Quek, Hsin-Yuan Huang, and Matthias C. Caro; arXiv preprint 2023)

While quantum state tomography is notoriously hard, most states hold little interest to practically-minded tomographers. Given that states and unitaries appearing in Nature are of bounded gate complexity, it is natural to ask if efficient learning becomes possible. In this work, we prove that to learn a state generated by a quantum circuit with G two-qubit gates to a small trace distance, a sample complexity scaling linearly in G is necessary and sufficient. We also prove that the optimal query complexity to learn a unitary generated by G gates to a small average-case error scales linearly in G. While sample-efficient learning can be achieved, we show that under reasonable cryptographic conjectures, the computational complexity for learning states and unitaries of gate complexity G must scale exponentially in G. We illustrate how these results establish fundamental limitations on the expressivity of quantum machine learning models and provide new perspectives on no-free-lunch theorems in unitary learning. Together, our results answer how the complexity of learning quantum states and unitaries relates to the complexity of creating these states and unitaries. 

Dissipation-enabled bosonic Hamiltonian learning via new information-propagation bounds (Tim Möbus, Andreas Bluhm, Matthias C. Caro, Albert Werner, and Cambyse Rouzé; arXiv preprint 2023)

Reliable quantum technology requires knowledge of the dynamics governing the underlying system. This problem of characterizing and benchmarking quantum devices or experiments in continuous time is referred to as the Hamiltonian learning problem. In contrast to multi-qubit systems, learning guarantees for the dynamics of bosonic systems have hitherto remained mostly unexplored. For m-mode Hamiltonians given as polynomials in annihilation and creation operators with modes arranged on a lattice, we establish a simple moment criterion in terms of the particle number operator which ensures that learning strategies from the finite-dimensional setting extend to the bosonic setting, requiring only coherent states and heterodyne detection on the experimental side. We then propose an enhanced procedure based on added dissipation that even works if the Hamiltonian time evolution violates this moment criterion: With high success probability it learns all coefficients of the Hamiltonian to accuracy ε using a total evolution time of O(log(m)/ε^2). Our protocol involves the experimentally reachable resources of projected coherent state preparation, dissipative regularization akin to recent quantum error correction schemes involving cat qubits stabilized by a nonlinear multi-photon driven dissipation process, and heterodyne measurements. As a crucial step in our analysis, we establish our moment criterion and a new Lieb-Robinson type bound for the evolution generated by an arbitrary bosonic Hamiltonian of bounded degree in the annihilation and creation operators combined with photon-driven dissipation. Our work demonstrates that a broad class of bosonic Hamiltonians can be efficiently learned from simple quantum experiments, and our bosonic Lieb-Robinson bound may independently serve as a versatile tool for studying evolutions on continuous variable systems. 

Classical Verification of Quantum Learning (Matthias C. Caro, Marcel Hinsche, Marios Ioannou, Alexander Nietner, and Ryan Sweke; ITCS 2024 ; arXiv preprint 2023)

Quantum data access and quantum processing can make certain classically intractable learning tasks feasible. However, quantum capabilities will only be available to a select few in the near future. Thus, reliable schemes that allow classical clients to delegate learning to untrusted quantum servers are required to facilitate widespread access to quantum learning advantages. Building on a recently introduced framework of interactive proof systems for classical machine learning, we develop a framework for classical verification of quantum learning. We exhibit learning problems that a classical learner cannot efficiently solve on their own, but that they can efficiently and reliably solve when interacting with an untrusted quantum prover. Concretely, we consider the problems of agnostic learning parities and Fourier-sparse functions with respect to distributions with uniform input marginal. We propose a new quantum data access model that we call "mixture-of-superpositions" quantum examples, based on which we give efficient quantum learning algorithms for these tasks. Moreover, we prove that agnostic quantum parity and Fourier-sparse learning can be efficiently verified by a classical verifier with only random example or statistical query access. Finally, we showcase two general scenarios in learning and verification in which quantum mixture-of-superpositions examples do not lead to sample complexity improvements over classical data. Our results demonstrate that the potential power of quantum data for learning tasks, while not unlimited, can be utilized by classical agents through interaction with untrusted quantum entities. 

The power and limitations of learning quantum dynamics incoherently (Sofiene Jerbi, Joe Gibbs, Manuel S. Rudolph, Matthias C. Caro, Patrick J. Coles, Hsin-Yuan Huang, and Zoë Holmes; arXiv preprint 2023)

Quantum process learning is emerging as an important tool to study quantum systems. While studied extensively in coherent frameworks, where the target and model system can share quantum information, less attention has been paid to whether the dynamics of quantum systems can be learned without the system and target directly interacting. Such incoherent frameworks are practically appealing since they open up methods of transpiling quantum processes between the different physical platforms without the need for technically challenging hybrid entanglement schemes. Here we provide bounds on the sample complexity of learning unitary processes incoherently by analyzing the number of measurements that are required to emulate well-established coherent learning strategies. We prove that if arbitrary measurements are allowed, then any efficiently representable unitary can be efficiently learned within the incoherent framework; however, when restricted to shallow-depth measurements, only low-entangling unitaries can be learned. We demonstrate our incoherent learning algorithm for low-entangling unitaries by successfully learning a 16-qubit unitary on ibmq_kolkata, and further demonstrate the scalability of our proposed algorithm through extensive numerical experiments.

Learning Quantum Processes and Hamiltonians via the Pauli Transfer Matrix (arXiv preprint 2022)

Learning about physical systems from quantum-enhanced experiments, relying on a quantum memory and quantum processing, can outperform learning from experiments in which only classical memory and processing are available. Whereas quantum advantages have been established for a variety of state learning tasks, quantum process learning allows for comparable advantages only with a careful problem formulation and is less understood. We establish an exponential quantum advantage for learning an unknown n-qubit quantum process N. We show that a quantum memory allows one to efficiently solve the following tasks: (a) learning the Pauli transfer matrix of an arbitrary N, (b) predicting expectation values of bounded Pauli-sparse observables measured on the output of an arbitrary N upon input of a Pauli-sparse state, and (c) predicting expectation values of arbitrary bounded observables measured on the output of an unknown N with sparse Pauli transfer matrix upon input of an arbitrary state. With quantum memory, these tasks can be solved using linearly-in-n many copies of the Choi state of N, and even time-efficiently in the case of (b). In contrast, any learner without quantum memory requires exponentially-in-n many queries, even when querying N on subsystems of adaptively chosen states and performing adaptively chosen measurements. In proving this separation, we extend existing shadow tomography upper and lower bounds from states to channels via the Choi-Jamiolkowski isomorphism. Moreover, we combine Pauli transfer matrix learning with polynomial interpolation techniques to develop a procedure for learning arbitrary Hamiltonians, which may have non-local all-to-all interactions, from short-time dynamics. Our results highlight the power of quantum-enhanced experiments for learning highly complex quantum dynamics. 

Out-of-distribution generalization for learning quantum dynamics (Matthias C. Caro, Hsin-Yuan Huang, Nicholas Ezzell, Joe Gibbs, Andrew T. Sornborger, Lukasz Cincio, Patrick J. Coles, and Zoë Holmes; Nature Communications 14, 3751 (2023); arXiv preprint 2022; see also the corresponding Poster)

Generalization bounds are a critical tool to assess the training data requirements of Quantum Machine Learning (QML). Recent work has established guarantees for in-distribution generalization of quantum neural networks (QNNs), where training and testing data are assumed to be drawn from the same data distribution. However, there are currently no results on out-of-distribution generalization in QML, where we require a trained model to perform well even on data drawn from a distribution different from the training distribution. In this work, we prove out-of-distribution generalization for the task of learning an unknown unitary using a QNN and for a broad class of training and testing distributions. In particular, we show that one can learn the action of a unitary on entangled states using only product state training data. We numerically illustrate this by showing that the evolution of a Heisenberg spin chain can be learned using only product training states. Since product states can be prepared using only single-qubit gates, this advances the prospects of learning quantum dynamics using near term quantum computers and quantum experiments, and further opens up new methods for both the classical and quantum compilation of quantum circuits.

Dynamical simulation via quantum machine learning with provable generalization (Joe Gibbs, Zoë Holmes, Matthias C. Caro, Nicholas Ezzell, Hsin-Yuan Huang, Lukasz Cincio, Andrew T. Sornborger, and Patrick J. Coles; Phys. Rev. Research 6, 013241 (2024); arXiv preprint 2022; see also the corresponding Poster)

Much attention has been paid to dynamical simulation and quantum machine learning (QML) independently as applications for quantum advantage, while the possibility of using QML to enhance dynamical simulations has not been thoroughly investigated. Here we develop a framework for using QML methods to simulate quantum dynamics on near-term quantum hardware. We use generalization bounds, which bound the error a machine learning model makes on unseen data, to rigorously analyze the training data requirements of an algorithm within this framework. This provides a guarantee that our algorithm is resource-efficient, both in terms of qubit and data requirements. Our numerics exhibit efficient scaling with problem size, and we simulate 20 times longer than Trotterization on IBMQ-Bogota.

On the generators of quantum dynamical semigroups with invariant subalgebras (Markus Hasenöhrl and Matthias C. Caro; Open Systems & Information Dynamics Vol. 30, No. 01, 2350001 (2023); arXiv preprint 2022)

The problem of characterizing GKLS-generators and CP-maps with an invariant von Neumann subalgebra A appeared in different guises in the literature. We prove two unifying results: First, we show how to construct a normal form for A-invariant GKLS-generators, if a normal form for A-invariant CP-maps is known - rendering the two problems essentially equivalent. Second, we provide a normal form for A-invariant CP-maps if A is atomic (which includes the finite-dimensional case). As an application we reproduce several results from the literature as direct consequences of our characterizations and thereby point out connections between different fields.

Generalization in quantum machine learning from few training data (Matthias C. Caro, Hsin-Yuan Huang, M. Cerezo, Kunal Sharma, Andrew Sornborger, Lukasz Cincio, and Patrick J. Coles; Nature Communications 13, 4919 (2022); arXiv preprint 2021)

Modern quantum machine learning (QML) methods involve variationally optimizing a parameterized quantum circuit on a training data set, and subsequently making predictions on a testing data set (i.e., generalizing).  In this work, we provide a comprehensive study of generalization performance in QML after training on a limited number N of training data points. 

We show that the generalization error of a quantum machine learning model with T trainable gates scales at worst as √(T/N). When only K ≪ T gates have undergone substantial change in the optimization process, we prove that the generalization error improves to √(K/N). 

Our results imply that the compiling of unitaries into a polynomial number of native gates, a crucial application for the quantum computing industry that typically uses exponential-size training data, can be sped up significantly. We also show that classification of quantum states across a phase transition with a quantum convolutional neural network requires only a very small training data set. Other potential applications include learning quantum error correcting codes or quantum dynamical simulation. Our work injects new hope into the field of QML, as good generalization is guaranteed from few training data. 
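As a back-of-the-envelope reading of the √(T/N) scaling above: to push the generalization error below a target ε, roughly N ≳ T/ε² training points suffice. The sketch below makes this arithmetic explicit; constants and logarithmic factors from the actual bound are omitted, so it is only indicative, and the function name is my own.

```python
import math

def training_data_needed(num_trainable_gates, target_error):
    """Smallest N with sqrt(T / N) <= target_error, i.e. N >= T / error^2.

    Constants and log factors from the actual bound are omitted; this is
    only an order-of-magnitude sketch.
    """
    return math.ceil(num_trainable_gates / target_error**2)

# A circuit with T = 100 trainable gates and target generalization error 0.1:
print(training_data_needed(100, 0.1))  # -> 10000
# If only K = 10 gates change substantially during optimization, the improved
# sqrt(K/N) bound lowers the requirement to 10 / 0.1**2 = 1000 training points.
```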

Quantum and classical dynamical semigroups of superchannels and semicausal channels (Markus Hasenöhrl and Matthias C. Caro; Journal of Mathematical Physics 63, 072204 (2022); arXiv preprint 2021)

Quantum devices are subject to natural decay. We propose to study these decay processes as the Markovian evolution of quantum channels, which leads us to dynamical semigroups of superchannels. A superchannel is a linear map that maps quantum channels to quantum channels, while satisfying suitable consistency relations. If the input and output quantum channels act on the same space, then we can consider dynamical semigroups of superchannels. No useful constructive characterization of the generators of such semigroups is known.

We characterize these generators in two ways: First, we give an efficiently checkable criterion for whether a given map generates a dynamical semigroup of superchannels. Second, we identify a normal form for the generators of semigroups of quantum superchannels, analogous to the GKLS form in the case of quantum channels. To derive the normal form, we exploit the relation between superchannels and semicausal completely positive maps, reducing the problem to finding a normal form for the generators of semigroups of semicausal completely positive maps. We derive a normal form for these generators using a novel technique, which applies also to infinite-dimensional systems.

Our work paves the way to a thorough investigation of semigroups of superchannels: Numerical studies become feasible because admissible generators can now be explicitly generated and checked. And analytic properties of the corresponding evolution equations are now accessible via our normal form. 

Encoding-dependent generalization bounds for parametrized quantum circuits (Matthias C. Caro, Elies Gil-Fuster, Johannes Jakob Meyer, Jens Eisert, and Ryan Sweke; Quantum 5, 582 (2021); arXiv preprint, 2021; see also the corresponding Poster and the Video Pitch)

A large body of recent work has begun to explore the potential of parametrized quantum circuits (PQCs) as machine learning models, within the framework of hybrid quantum-classical optimization. In particular, theoretical guarantees on the out-of-sample performance of such models, in terms of generalization bounds, have emerged. However, none of these generalization bounds depend explicitly on how the classical input data is encoded into the PQC. 

We derive generalization bounds for PQC-based models that depend explicitly on the strategy used for data-encoding. These imply bounds on the performance of trained PQC-based models on unseen data. Moreover, our results facilitate the selection of optimal data-encoding strategies via structural risk minimization, a mathematically rigorous framework for model selection. We obtain our generalization bounds by bounding the complexity of PQC-based models as measured by the Rademacher complexity and the metric entropy, two complexity measures from statistical learning theory. To achieve this, we rely on a representation of PQC-based models via trigonometric functions. Our generalization bounds emphasize the importance of well-considered data-encoding strategies for PQC-based models.
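The trigonometric representation mentioned above says, roughly, that a PQC-based model of a real input behaves as a truncated Fourier series whose accessible frequencies are fixed by the data-encoding strategy, while the trainable parameters only shape the coefficients. A schematic single-input evaluation (the helper name and coefficient values are illustrative, not from the paper):

```python
import math

def pqc_model(x, coeffs):
    """Evaluate f(x) = sum_k a_k * cos(k x) + b_k * sin(k x).

    coeffs maps each accessible frequency k to its pair (a_k, b_k); which
    frequencies appear is determined by the data-encoding strategy, so
    generalization bounds that depend on the encoding are natural.
    """
    return sum(a * math.cos(k * x) + b * math.sin(k * x)
               for k, (a, b) in coeffs.items())

# A model whose encoding exposes frequencies {0, 1, 2}:
f = lambda x: pqc_model(x, {0: (0.5, 0.0), 1: (1.0, 0.0), 2: (0.0, 0.25)})
print(f(0.0))  # -> 1.5
```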

Machine learning researchers and practitioners steadily enlarge the multitude of successful learning models. They achieve this through in-depth theoretical analyses and experiential heuristics. However, there is no known general-purpose procedure for rigorously evaluating whether newly proposed models indeed successfully learn from data. 

We show that such a procedure cannot exist. For PAC binary classification, uniform and universal online learning, and exact learning through teacher-learner interactions, learnability is in general undecidable, both in the sense of independence of the axioms in a formal system and in the sense of uncomputability. Our proofs proceed via computable constructions of function classes that encode the consistency problem for formal systems and the halting problem for Turing machines into complexity measures that characterize learnability. Our work shows that undecidability appears in the theoretical foundations of machine learning: There is no one-size-fits-all algorithm for deciding whether a machine learning model can be successful. We cannot in general automate the process of assessing new learning models. 

Necessary Criteria for Markovian Divisibility of Linear Maps (Matthias C. Caro and Benedikt R. Graswald; Journal of Mathematical Physics 62, 042203 (2021); arXiv preprint, 2020; see also the corresponding Poster and the Video Summary)

Characterizing those quantum channels that correspond to Markovian time evolutions is an open problem in quantum information theory; indeed, several different notions of quantum Markovianity exist. One such notion is that of infinitesimal Markovian divisibility for quantum channels, introduced in arXiv:math-ph/0611057. Whereas there is a complete characterization of infinitesimal Markovian divisible qubit channels, no necessary or sufficient criteria are known in higher dimensions, except for the necessity of a non-negative determinant.

We describe how to extend the notion of infinitesimal Markovian divisibility to general linear maps and closed and convex sets of generators. We give a general approach towards proving necessary criteria for (infinitesimal) Markovian divisibility that involve singular values of the linear map. With this approach, we prove two necessary criteria for infinitesimal divisibility of quantum channels that work in any finite dimension: an upper bound on the determinant in terms of a power of the smallest singular value, and in terms of a product of smallest singular values. Our criteria allow us to analytically construct, in any given dimension, a set of channels containing ones that are provably not infinitesimal Markovian divisible.

We also discuss the classical counterpart of this scenario, i.e., stochastic matrices with the generators given by transition rate matrices. Here, we show that no necessary criteria for infinitesimal Markovian divisibility of the form proved for quantum channels can hold in general. However, we describe subsets of all transition rate matrices for which our reasoning can be applied to obtain necessary conditions for Markovian divisibility.

Binary Classification with Classical Instances and Quantum Labels (Quantum Machine Intelligence 3, 18 (2021); arXiv preprint, 2020; see also the corresponding Poster and the Video Summary)

In classical statistical learning theory, one of the most well-studied problems is that of binary classification. The information-theoretic sample complexity of this task is tightly characterized by the Vapnik-Chervonenkis (VC) dimension. A quantum analog of this task, with training data given as a quantum state, has also been intensely studied and is now known to have the same sample complexity as its classical counterpart.

We propose a novel quantum version of the classical binary classification task by considering maps with classical input and quantum output and corresponding classical-quantum training data. We discuss learning strategies for the agnostic and for the realizable case and study their performance to obtain sample complexity upper bounds. Moreover, we provide sample complexity lower bounds which show that our upper bounds are essentially tight for pure output states. In particular, we see that the sample complexity is the same as in the classical binary classification task w.r.t. its dependence on accuracy, confidence and the VC-dimension. 

Pseudo-dimension of quantum circuits (Matthias C. Caro and Ishaun Datta; Quantum Machine Intelligence 2, 14 (2020); arXiv preprint, 2020; see also the corresponding Poster and the Video Summary)

We characterize the expressive power of quantum circuits with the pseudo-dimension, a measure of complexity for probabilistic concept classes. We prove pseudo-dimension bounds on the output probability distributions of quantum circuits; the upper bounds are polynomial in circuit depth and number of gates. Using these bounds, we exhibit a class of circuit output states out of which at least one has exponential gate complexity of state preparation, and moreover demonstrate that quantum circuits of known polynomial size and depth are PAC-learnable.

Quantum learning Boolean linear functions w.r.t. product distributions (Quantum Inf Process 19, 172 (2020); arXiv preprint, 2019; see also the corresponding Poster and the Video Summary)

The problem of learning Boolean linear functions from quantum examples w.r.t. the uniform distribution can be solved on a quantum computer using the Bernstein-Vazirani algorithm. A similar strategy can be applied in the case of noisy quantum training data, as was observed in arXiv:1702.08255v2. However, extensions of these learning algorithms beyond the uniform distribution have not yet been studied. We employ the biased quantum Fourier transform introduced in arXiv:1802.05690v2 to develop efficient quantum algorithms for learning Boolean linear functions on n bits from quantum examples w.r.t. a biased product distribution. 

Our first procedure is applicable to any bias except full bias and requires O(ln(n)) many quantum examples. The number of quantum examples used by our second algorithm is independent of n, but the strategy is applicable only for small bias. Moreover, we show that the second procedure is stable w.r.t. noisy training data and w.r.t. faulty quantum gates. This also enables us to solve a version of the learning problem in which the underlying distribution is not known in advance. 

Finally, we prove lower bounds on the classical and quantum sample complexities of the learning problem. Whereas classically Ω(n) examples are necessary independently of the bias, we are able to establish a quantum sample complexity lower bound of Ω(ln(n)) only under an assumption of large bias. Nevertheless, this allows for a discussion of the performance of our suggested learning algorithms w.r.t. sample complexity. With our analysis we contribute to a more quantitative understanding of the power and limitations of quantum training data for learning classical functions. 
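For context on the uniform-distribution baseline mentioned at the start of the abstract: the Bernstein-Vazirani algorithm recovers the hidden string of a Boolean linear function f(x) = s·x mod 2 from a single quantum query. Below is a small classical statevector simulation of that routine as a sketch; the biased-distribution algorithms of the paper itself, built on the biased quantum Fourier transform, are more involved.

```python
def bernstein_vazirani(s_bits):
    """Classically simulate Bernstein-Vazirani for f(x) = s.x mod 2.

    One quantum query to the phase oracle recovers the hidden string s:
    H^n |0..0> gives the uniform superposition; the phase oracle multiplies
    the amplitude of |x> by (-1)^(s.x); a final H^n layer maps the result
    exactly onto the basis state |s>.
    """
    n = len(s_bits)
    dim = 1 << n
    s = int("".join(map(str, s_bits)), 2)
    # State after H^n on |0..0>: uniform superposition.
    amp = [1.0 / (dim ** 0.5)] * dim
    # Phase oracle: |x> -> (-1)^(s.x) |x>.
    amp = [a * (-1) ** bin(x & s).count("1") for x, a in enumerate(amp)]
    # Final H^n layer = normalized fast Walsh-Hadamard transform.
    h = 1
    while h < dim:
        for i in range(0, dim, 2 * h):
            for j in range(i, i + h):
                u, v = amp[j], amp[j + h]
                amp[j], amp[j + h] = (u + v) / 2 ** 0.5, (u - v) / 2 ** 0.5
        h *= 2
    # Measurement yields s with certainty (all amplitude sits on |s>).
    outcome = max(range(dim), key=lambda x: abs(amp[x]))
    return [int(b) for b in format(outcome, f"0{n}b")]

print(bernstein_vazirani([1, 0, 1, 1]))  # -> [1, 0, 1, 1]
```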

My talks

Where you might have seen my work or myself

I have attended the following conferences, summer schools, and workshops and presented my work at some of them: