Positive random walks and positive-semidefinite random matrices
Speaker: Joel Tropp, California Institute of Technology (Wednesday, June 17, 4:15-5:30 PM)
On the real line, a random walk that can only move in the positive direction is very unlikely to remain close to its origin. After a fixed number of steps, the left tail has a Gaussian profile under minimal assumptions. Remarkably, a similar phenomenon occurs when we consider a positive random walk on the cone of positive-semidefinite matrices. After a fixed number of steps, the minimum eigenvalue is described by a Gaussian random matrix model.
This talk introduces a new way to make this intuition rigorous. The methodology addresses an open problem in computational mathematics about sparse random embeddings. The presentation targets a general mathematical audience.
Preprint: https://arxiv.org/abs/2501.16578
Randomized Trace Estimation: Algorithms, Applications, and Quantum Perspectives
Speaker: Haim Avron, Tel Aviv University (Monday, June 15, 09:15-10:00 AM)
Randomized trace estimation is by now a standard and highly useful primitive in large-scale numerical linear algebra, with applications ranging from matrix function approximation to modern machine learning and scientific computing. Early work in the area established nonasymptotic guarantees for Monte Carlo estimators in the implicit-matrix setting, clarified the effect of the probing distribution on sample complexity, and helped launch a broad line of subsequent research on improved estimators.
This talk connects that established line of work to more recent directions motivated by optimization and quantum computation. On the optimization side, trace and spectral-sum estimators lead to scalable methods for quantities such as log-determinants and related objectives. On the quantum side, the same basic problem reappears under new computational models, including restricted access patterns and quantum representations of matrices, leading both to new algorithmic opportunities and to new limitations. Together, these developments show how randomized trace estimation continues to evolve as a rich theme within RandNLA.
Preconditioned log-determinant estimation: when is one sample enough?
Speaker: Alice Cortinovis, University of Pisa (Monday, June 15, 10:30-11:15 AM)
We consider randomized algorithms for estimating the log-determinant of (regularized) symmetric positive semi-definite matrices. The matrix is only accessed through matrix vector products, and the techniques are based on the introduction of a preconditioner and stochastic trace estimation. We claim that preconditioning as much as we can and making a rough estimate of the residual part with a small budget achieves a small error in most of the cases. We choose a Nyström preconditioner and estimate the residual using only one sample of stochastic Lanczos quadrature. We analyze the performance of this strategy from a theoretical and practical viewpoint. We also present an algorithm that, at almost no additional cost, detects whether the proposed strategy is not the most effective, in which case it uses more samples for the SLQ part. Numerical examples on several test matrices show that our proposed methods are competitive with existing algorithms.
Randomized orthogonalization for Krylov subspace methods
Speaker: Igor Simunec, EPFL (Monday, June 15, 11:15-12:00 PM)
Randomized sketching has recently emerged as an effective tool for reducing the computational cost of Krylov subspace methods for a wide range of linear algebra problems, including linear systems, eigenvalue problems, and matrix functions. The central component of this approach is the randomized Arnoldi process, which uses randomized subspace embeddings to construct a sketch-orthonormal Krylov basis at reduced computational and communication costs. Approximate solutions to the original problem (e.g., a linear system) are then obtained through an oblique (sketch-orthogonal) projection onto the Krylov subspace. In the first part of the talk, we review randomized orthogonalization techniques and two variants of the randomized Krylov framework, discussing their advantages and limitations. In the second part, we examine the impact of oblique projections on convergence and show that they may lead to delays or irregularities in the convergence, especially for more challenging problems, undermining the robustness of randomized Krylov methods. To address this issue, we introduce a modification of the randomized Arnoldi process that recovers the same approximations obtained with the deterministic Arnoldi process, without compromising on computational efficiency. Numerical experiments demonstrate that the resulting approach combines the speed of randomized methods with the robustness of deterministic Arnoldi.
Randomized Krylov methods for inverse problems
Speaker: Juilanne Chung, Emory University (Monday, June 15, 1:30-2:15 PM)
In this talk, I will describe randomized Krylov subspace methods for efficiently computing regularized solutions to large-scale linear inverse problems. Building on the recently developed randomized Gram-Schmidt process, where sketched inner products are used to estimate inner products of high-dimensional vectors, a randomized Golub-Kahan algorithm is developed, which also works for general rectangular matrices. New iterative solvers based on the randomized Golub-Kahan factorization are proposed, which extend the capabilities of randomized GMRES. Hybrid projection methods are also considered: these combine iterative projection methods, based on both the randomized Arnoldi and randomized Golub-Kahan factorizations, with Tikhonov regularization, where regularization parameters can be selected automatically during the iterative process. Numerical results from image deblurring and seismic tomography show the potential benefits of these approaches.
On regularization via early stopping for least squares regression
Speaker: Elizaveta Rebrova, Princeton University (Monday, June 15, 2:15-3:00 PM)
I will present recent joint work with J. Lok and R. Sonthalia with the fine-grained analysis of the early stopping as a regularization mechanism for overparameterized least-squares problems. For the general step-size schedules, we show that the iterate at any finite stopping time is exactly the minimum-norm solution of a generalized ridge-regularized least-squares problem. This gives a precise equivalence between early stopping and an explicit regularization procedure, and clarifies how the induced regularization depends on the spectrum of the data matrix and on the step-size schedule. I will also discuss consequences for the early-stopped risk and stopping-time selection, and how this viewpoint is connected with classical semiconvergence ideas from numerical linear algebra and inverse problems.
Everything is Vecchia: Unifying low-rank and sparse inverse Cholesky approximations
Speaker: Robert Webber, University of California San Diego (Monday, June 15, 3:30-4:15 PM)
The partial pivoted Cholesky approximation accurately represents matrices that are close to being low-rank. Meanwhile, the Vecchia approximation accurately represents matrices with inverse Cholesky factors that are close to being sparse. What happens if a partial Cholesky approximation is combined with a Vecchia approximation of the residual? Our recent work shows how the sum is exactly a Vecchia approximation of the original matrix with an augmented sparsity pattern. Thus, the Vecchia approximation subsumes a class of existing matrix approximations and has broad applicability.
Randomised sketching for tensor train rounding
Speaker: Mi-Song Dupuy, LJLL, Sorbonne Université (Tuesday, June 16, 9:00-9:45 AM)
Randomised sketching techniques enable efficient dimensionality reduction while preserving essential data properties, yet their application to high-order tensors remains challenging. In this talk, we introduce TTStack, a structured random projection designed specifically for the Tensor Train (TT) format. By tuning two parameters, $P$ and $R$, TTStack unifies and generalises existing TT-adapted sketches, interpolating between the Khatri–Rao sketch ($R = 1$) and the Gaussian TT sketch ($P = 1$). We prove that TTStack satisfies both the oblivious subspace embedding (OSE) and oblivious subspace injection (OSI) properties, with theoretical guarantees scaling linearly in tensor order $d$ and subspace dimension $r$—a significant improvement over prior methods, which suffer from exponential scaling. These results yield quasi-optimal error bounds for QB factorisation and randomised TT rounding. The talk will showcase numerical experiments on synthetic tensors, Hadamard products, and a quantum chemistry application, demonstrating TTStack’s potential for scalable and efficient high-dimensional tensor processing.
Exponential-scale linear algebra via tensor network dimensionality reduction
Speaker: Chris Camaño, California Institute of Technology (Tuesday, June 16, 9:45-10:30 AM)
Tensor network methods have become a leading technology for computations with exponentially large data. The compression achieved by these data-structures makes it possible to store and manipulate operators that are far beyond the reach of traditional dense or sparse RandNLA methods. In many settings, explicitly storing even a single operator at this scale is infeasible.
In this talk, we describe how several familiar RandNLA algorithms can be adapted to the tensor network setting. We propose implementing methods such as Nyström approximation and variancereduced trace estimators with structured sketching matrices whose columns are random matrix product states (rMPSs) populated with independent Gaussian entries. Our methods are efficient, theoretically sound, and numerically stable. We will demonstrate their effectiveness on several problems from quantum physics, state our key theoretical guarantees, and outline the mathematical ideas underlying the analysis. No prior knowledge of tensor networks will be assumed.
Transpose-free linear algebra
Speaker: Diana Halikias, Courant Institute, New York University (Tuesday, June 16, 11:00-11:45 AM)
We study the limitations of matrix-free algorithms that access a matrix $A$ only through forward matrix-vector products (matvecs) $x \mapsto Ax$, without access to the transpose $A^\top$ or its action. This setting arises naturally in operator learning, inverse problems, and matrix-free PDE solvers, where adjoint evaluations may be unavailable or prohibitively expensive. We show that the lack of transpose access creates severe and sometimes insurmountable barriers. For Krylov methods, we prove that the sequence of projected operator norms produced by Arnoldi iteration can follow any prescribed nondecreasing curve, showing that forward matvecs alone provide essentially no reliable information about the spectral norm. For several core problems, including low-rank approximation, least-squares, and column subset selection, we show that distinct matrices can generate identical matvecs while having fundamentally different solutions. Finally, we prove quantitative lower bounds on the number of forward (possibly adaptive, randomized) matvecs required for approximation tasks.
The matrix-vector complexity of Ax=b
Speaker: Raphael Meyer, UC Berkeley (Tuesday, June 16, 11:45-12:30 PM)
Solving linear systems $Ax=b$ is a fundamental pillar of NLA. For over 60 years, iterative methods that access $A$ only through matrix-vector products have been the standard approach for solving large linear systems. While lower bounds exist for many special cases, prior work has not shown that methods like GMRES and MINRES achieve an asymptotically optimal matrix-vector complexity for approximately solving $Ax=b$. In this talk, we prove that MINRES (and CG on the normal equations) achieves an optimal complexity (up to a factor of 4) over all randomized matrix-vector methods, and GMRES achieves the best possible over all "transpose-free" methods. Time permitting, we will also look at extensions to the $f(A)b$ and $tr(f(A))$ problems.
Stochastic Rounding
Speaker: Petros Drineas, Purdue University (Tuesday, June 16, 2:00-2:45 PM)
Motivated by the resurgence of stochastic rounding, we consider stochastic nearness rounding of tall and thin real matrices. We provide theoretical and empirical evidence showing that, with high probability, the smallest singular value of a stochastically rounded matrix is bounded away from zero -- regardless of how close the original matrix was to being rank-deficient and even if it were rank-deficient. In other words, stochastic rounding implicitly regularizes tall-and-thin matrices so that the rounded version has full rank. We will briefly discuss the implications of such results for solving regression problems.
Blind Error Estimation for CUR Approximations
Speaker: Katherine Pearce, Oden Institute, UT Austin (Wednesday, June 17, 9:00-9:45 AM)
CUR approximations are low-rank factorizations that use so-called “natural bases” for the row and column space of a matrix, consisting of subsets of its actual rows and columns. As a result, the CUR approximation preserves many important matrix properties, such as sparsity, non-negativity, and interpretability in the context of the original data. In this talk, we propose a "blind" error estimation technique that does not require additional access to the input matrix, which is crucial for applications in which the matrix is only accessible via matrix-vector products, too large to be stored in main memory, or streaming data. We show that our estimator is an unbiased estimator of the squared Frobenius error and requires only a few extra column accesses. We also demonstrate that its variance can be explicitly characterized in terms of the extra column selection strategy and the quality of the CUR approximation, and we present numerical results for both adversarial synthetic matrices and real-world empirical datasets.
Randomization strategies for low-rank matrix equation solvers
Speaker: Valeria Simoncini, Universita' di Bologna (Wednesday, June 17, 9:45-10:30 AM)
Linear matrix equations $A_1 X B_1 + \ldots + A_k X B_k = C$ arise in a largely increasing number of applications, as effective alternatives to vector linear systems. A significant feature of matrix-oriented formulations is that properties of the original problem, such as symmetry and low-rank structure, are maintained and can possibly be exploited.
Under certain conditions on the data, numerical low rank is inherited by the solution matrix $X$, so that looking for a low-rank approximation is sensible. To this end, matrix short recurrences have been recently introduced, that maintain low-rank iterates - in factored form - during the whole approximation process.
As the iteration proceeds, the rank of the iterates may quickly increase, so that low rank needs to be enforced with possible accuracy loss. Randomization methods come to rescue, allowing the recurrence to proceed seamlessly towards the sought after approximation, with high probability.
In this talk we present examples where randomization strategies are successfully employed as a key ingredient of efficient short recurrences for linear matrix equations.
Talk based on collaborations with Martina Iannacito (while at UniBO), Davide Palitta (UniBO), and Catherine Powell (University of Manchester).
Randomized Rayleigh--Ritz procedure
Presenter: Nian Shao, EPFL (Wednesday, June 17, 11:00-11:45 AM)
Extracting approximate eigenpairs from a prescribed subspace is of fundamental importance in eigenvalue computation. While projecting the target eigenvector onto the subspace yields satisfactory accuracy, extracting an approximate eigenpair that attains a comparable convergence rate has remained a long-standing open problem. Although the standard Rayleigh--Ritz procedure is widely used for this purpose, it may suffer from deteriorated convergence of Ritz values and may even fail to produce convergent Ritz vectors. In this talk, we address this long-standing open problem by introducing a randomized Rayleigh--Ritz procedure whose output converges at a rate similar to the ideal projection. Our analysis requires only the simplicity of the target eigenvalue and extends naturally to nonlinear eigenvalue problems.
Reducing acquisition time and radiation damage: data-driven subsampling for spectro-microscopy
Speaker: Hussam Al Daas, STFC, Rutherford Appleton Laboratory (Wednesday, June 17, 11:45-12:30 PM)
Spectro-microscopy is an experimental technique which can be used to observe spatial variations in chemical state and changes in chemical state over time or under experimental conditions. As a result it has broad applications across areas such as energy materials, catalysis, environmental science and biological samples. However, the technique is often limited by factors such as long acquisition times and radiation damage. We present two measurement strategies that allow for significantly shorter experiment times and total doses applied. The strategies are based on taking only a small subset of all the measurements (e.g. sparse acquisition or subsampling), and then computationally reconstructing all unobserved measurements using mathematical techniques. The methods are data-driven, using spectral and spatial importance subsampling distributions to identify important measurements. As a result, taking as little as 4-6\% of the measurements is sufficient to capture the same information as in a conventional scan.
Fast Rank-Adaptive CUR and Its Extension to Kernel Matrices
Speaker: Gunnar Martinsson, University of Texas at Austin (Wednesday, June 17, 2:00-2:45 PM)
The talk begins with a description of ""IterativeCUR"", a rank-adaptive CUR algorithm built around a single small sketch of the residual that is successively downdated and recycled. The discussion starts from the familiar randomized-SVD pattern and then reviews the ingredients that lead to IterativeCUR: a posteriori error estimation, adaptive rank determination, recycling of a small sketch, and the use of CUR and interpolatory decompositions. The resulting method computes accurate CUR factorizations without prior knowledge of the numerical rank, interacts with the matrix only through a small sketch and selected rows and columns, and has proved highly competitive in practice.
The second part of the talk describes how these ideas can be extended to kernel matrices with entries A(i,j)=k(x_i,y_j), in situations where neither the full matrix nor matrix-vector products are available. The key idea is to replace the global matrix sketch by a surrogate model constructed from a small set of sampling points in the target region. This surrogate can then be used to drive an adaptive CUR procedure in much the same spirit as IterativeCUR. The talk will outline both a randomized variant, which uses the surrogate to initialize a thin recycled sketch, and deterministic variants based on surrogate compression. The emphasis will be on the algorithmic framework, the expected complexity, and the role of analytic structure in making reliable kernel CUR feasible.
Quasi-optimal hierarchically semi-separable matrix approximation
Speaker: David Persson, New York University & Flatiron Institute (Wednesday, June 17, 2:45-3:30 PM)
We present a randomized algorithm for producing a quasi-optimal hierarchically semi-separable (HSS) approximation to an $N\times N$ matrix $A$ using only matrix-vector products with $A$ and $A^T$. We prove that, using $O(k \log(N/k))$ matrix-vector products and $O(N k^2 \log(N/k))$ additional runtime, the algorithm returns an HSS matrix $B$ with rank-$k$ blocks whose expected Frobenius norm error $\mathbb{E}[\|A - B\|_F^2]$ is at most $O(\log(N/k))$ times worse than the best possible approximation error by an HSS rank-$k$ matrix. In fact, the algorithm we analyze in a simple modification of an empirically effective method proposed by [Levitt \& Martinsson, SISC 2024]. As a stepping stone towards our main result, we prove two results that are of independent interest: a similar guarantee for a variant of the algorithm which accesses $A$'s entries directly, and explicit error bounds for near-optimal subspace approximation using \textit{projection-cost-preserving sketches}. To the best of our knowledge, our analysis constitutes the first polynomial-time quasi-optimality result for HSS matrix approximation, both in the explicit access model and the matrix-vector product query model.
RandLinearAlgebra.jl: A cyberlaboratory for RandNLA
Speaker: Nathaniel Pritchard, University of Oxford (Thursday, June 18, 9:00-9:45 AM)
Despite extensive theoretical work on the performance of algorithms in Randomized Numerical Linear Algebra (RandNLA), significant gaps remain between worst-case theoretical guarantees and observed performance on practical problems. These gaps create a need for software platforms that enable rigorous experimentation and comparison across algorithms, data regimes, and implementation choices. This poster presents RandLinearAlgebra.jl, a Julia library with a modular and extensible design intended to support such experimentation. By making it easier to implement, test, and compare RandNLA methods, RandLinearAlgebra.jl provides a practical tool for researchers and practitioners seeking to understand and apply randomized linear algebra techniques.
Interpolatory Dynamical Low-Rank Approximations
Speaker: Bart Vandereycken, University of Geneva (Thursday, June 18, 9:45-10:30 AM)
Dynamical low-rank approximation (DLRA) reduces the cost of solving large-scale matrix differential equations by evolving the solution on a low-rank manifold. For nonlinear problems, however, evaluating the full velocity field can still be expensive. We propose DLRA-DEIM, which replaces orthogonal tangent-space projections by oblique, data-sparse projections selected through the discrete empirical interpolation method. The resulting continuous-time model is formulated as a well-posed Filippov differential inclusion, accounting for discontinuities in the selected interpolation indices.
We establish existence, exactness, and error bounds analogous to classical DLRA, and give a convex-polytope characterization for the QDEIM case. Based on this framework, we introduce PRK-DEIM, a family of projected Runge--Kutta integrators that retains the convergence order of standard projected Runge--Kutta methods while substantially reducing computational cost.
Joint work with Benjamin Carrel, Daniel Kressner, and Hei Yin Lam.
Randomizing truncated CG to escape saddle points in high dimension
Speaker: Nicolas Boumal, EPFL (Thursday, June 18, 11:00-11:45 PM)
Consider minimizing a nonconvex function f : R^d -> R. To escape its saddle points, a deterministic algorithm needs at least Omega(d) "pieces of information" about f (think: gradients or Hessian-vector products). In stark contrast, randomized algorithms can bring this complexity down to poly-log(d). On the flip side, those randomized algorithms "bob around": they do not converge. We propose an algorithm which is (aptly) able to amplify the randomization when we are near a saddle, but which is also able to "absorb" the randomization when we are near a minimizer (in order to converge there). The method is based on conjugate gradients and trust regions: https://arxiv.org/abs/2603.15494.
Towards Universal Convergence of Backward Error in Linear System Solvers
Speaker: Michal Derezinski, University of Michigan (Thursday, June 18, 11:45-12:30 PM)
The quest for an algorithm that solves an $n\times n$ linear system in $O(n^2)$ time complexity, or $O(n^2 \poly(1/\epsilon))$ when solving up to $\epsilon$ relative error, is a longstanding open problem in numerical linear algebra and theoretical computer science. There are two predominant paradigms for measuring relative error: forward error (i.e., distance from the output to the optimum solution) and backward error (i.e., distance to the nearest problem solved by the output). In most prior studies, convergence of iterative linear system solvers is measured via various notions of forward error, and as a result, depends heavily on the conditioning of the input. Yet, the numerical analysis literature has long advocated for backward error as the more practically relevant notion of approximation. In this talk, I will show that --- surprisingly --- the classical and simple Richardson iteration incurs at most $1/k$ (relative) backward error after $k$ iterations on any positive semidefinite (PSD) linear system, irrespective of its condition number. This universal convergence rate implies an $O(n^2/\epsilon)$ complexity algorithm for solving a PSD linear system to $\epsilon$ backward error, and we establish similar or better complexity when using a variety of Krylov solvers beyond Richardson. Then, by directly minimizing backward error over a Krylov subspace, we attain an even faster $O(1/k^2)$ universal rate, and we turn this into an efficient algorithm, MINBERR, with complexity $O(n^2/\sqrt\epsilon)$. We extend this approach via normal equations and random Gaussian perturbations to solving general linear systems, for which we empirically observe $O(1/k)$ convergence. We report strong numerical performance of our algorithms on benchmark problems. Joint work with Yuji Nakatsukasa and Elizaveta Rebrova.
Local weak limit of minimum spanning arborescence of complete graphs
Presenter: Swarnadeep Bagchi, University of Victoria (Monday, June 15, 3:45-5:00 PM)
We can sample minimun spanning arborescence (MSA) in complete graph using CLEB algorithm. Arborescence is a directed version of spanning tree. We first take i.i.d. Exponential(1) weights on the edges of complete graph. Then we use CLEB algorithm to sample MSA in complete graphs. We will show how can we find local weak limit of MSA of complete graphs as the size of complete graphs goes to infinity.
Parallel approximate Cholesky for Laplacian Systems via Independent Sets
Presenter: Yves Baumann, ETH Zurich (Monday, June 15, 3:45-5:00 PM)
In a breakthrough result, Spielman and Teng (2004) developed a nearly linear time Laplacian system solver. Since then, the algorithms to solve Laplacian linear systems have been continuously refined to be parallel and practical. Gao, Kyng and Spielman (2023) implemented a single threaded version of the algorithm "Approximate Cholesky" which was then parallelized by Baumann and Kyng (2024). We implement and benchmark the proposed parallel Approximate Cholesky solver in C++ and compare it to existing baselines on a variety of problems. We study trade-offs between the chosen independent set subroutines and the parallelization achieved of the factorization. To make our results reproducible, we open-source the solver and package it for various programming languages.
Stability and Conditioning of Pivotless LU
Presenter: Andrea Di Antonio, EPFL/PSI (Monday, June 15, 3:45-5:00 PM)
Gaussian elimination without pivoting is fast and preserves sparsity and Hermitian structure, but it can be unstable on indefinite or ill-conditioned matrices. For dense matrices, the smoothed analysis of Sankar, Spielman, and Teng fixes this by perturbing every entry independently, but that destroys both sparsity and structure. We instead recover stability with far less randomness: for a Hermitian matrix we study both a random scalar real shift and a deterministic scalar imaginary shift on the diagonal. For the real perturbation we prove high-probability bounds on the growth factor, condition number, and preconditioning quality. For the imaginary shift, a simple spectral identity yields explicit design rules that guarantee both a stable factorization and a good preconditioner.
Efficient QR-based Column Subset Selection through Randomized Sparse Embeddings
Presenter: Israa Fakih, EPFL/PSI (Monday, June 15, 3:45-5:00 PM)
In this paper, we introduce an efficient algorithm for column subset selection that combines the column-pivoted QR factorization with sparse subspace embeddings. The proposed method, SE-QRSC, is particularly effective for wide matrices with significantly more columns than rows. Starting from a matrix A, the algorithm selects k columns from the sketched matrix B=AΩ^{T}, where Ω is a sparse subspace embedding of range(A^{T}). The sparsity structure of Ω is then exploited to map the selected pivots back to the corresponding columns of A, which are then used to produce the final subset of selected columns. We prove that this procedure yields a factorization with strong rank-revealing properties, thus revealing the spectrum of A. The resulting bounds exhibit a reduced dependence on the number of columns of A compared to those obtained from the strong rank-revealing QR factorization of A. For general matrices, the algorithm can be extended by first applying an additional subspace embedding of range(A).
A mixed precision algorithm for matrix root functions
Presenter: Bowen Gao, Fudan University (Monday, June 15, 3:45-5:00 PM)
Mixed precision computation has gained significant traction in recent years driven by the rapid evolution of machine learning and specialized hardware infrastructure. Recent advancements on mixed precision algorithms have substantially enhanced the performance of various linear algebra solvers. In this work, we propose a mixed precision algorithm for computing matrix root functions, primarily the matrix square root function. We introduce a nonlinear iterative refinement framework designed to refine a lower precision approximation to the working precision level. Mathematical analysis demonstrates that our carefully designed mixed precision algorithms compute the matrix square root function to full working precision accuracy. Numerical experiments further indicate that they offer speedup across most scenarios compared to the fixed precision method on the x86-64 architecture.
Sharp error bounds for approximate eigenvalues and singular values from subspace methods
Presenter: Irina-Beatrice Haas, University of Oxford (Monday, June 15, 3:45-5:00 PM)
Subspace methods are commonly used for finding approximate eigenvalues and singular values of large-scale matrices. Once a subspace is found, the Rayleigh-Ritz method (for symmetric eigenvalue problems) and Petrov-Galerkin projection (for singular values) are the de facto method for extraction of eigenvalues and singular values. In this work we derive quadratic error bounds for approximate eigenvalues of symmetric matrices and for singular values of arbitrary matrices obtained from the above methods.
Large scale Quantum Process Tomography
Presenter: Dimitri Lanier, EPFL/PSI (Monday, June 15, 3:45-5:00 PM)
I ll present a project that consist in learning a global MPO representation based on noisy estimation of the partial trace, allowing to perform quantum channel Tomography
Distributed sketching on GPUs
Presenter: Johannes Laute, PSI (Monday, June 15, 3:45-5:00 PM)
Sketching is a fundamental step in randomized linear algebra. There is a plethora of different sketching algorithms with different trade-offs in embedding quality, computational and memory complexity and parallelizability to choose from. We implement different algorithms with a strong focus on efficiency, scalability and GPU implementations, and expose them via Python and Julia bindings. We perform extensive evaluations in a distributed setting to analyse the communication behaviour and scalability of the different algorithms.
Randomized Streaming Tensor Train Decomposition
Presenter: Mariana Martínez Aguilar, EPFL (Monday, June 15, 3:45-5:00 PM)
We consider the problem of incrementally updating a Tensor Train (TT) decomposition as new data arrives in a streaming setting, without reforming the full tensor. Given an existing TT factorization and a new incoming tensor, we propose two randomized algorithms to compute a TT approximation of the accumulation tensor: a fixed-rank algorithm and an adaptive one. Both methods build on a randomized range finder strategy combined with Khatri-Rao product (KRP) structured sketching, enabling efficient computation of approximate bases for the updated tensor. A key advantage over existing methods such as TT-ICE is that our algorithms can not only expand but also change the basis, allowing for lower TT ranks and better approximations. The adaptive algorithm automatically determines the output TT ranks to meet a user-prescribed relative accuracy, using a residual sketch as a stopping criterion. We introduce a new partial contractions algorithm for sketching tensors not in TT format, building block of both methods. We demonstrate the effectiveness of our algorithms on applications including small angle scattering data, streaming dynamic mode decomposition, and calcium imaging.
A medley of fun techniques for solving interesting problems in RandNLA
Presenter: Fabio Matti, EPFL (Monday, June 15, 3:45-5:00 PM)
We consider three bite-sized problems arising in the design and analysis of algorithms in randomized numerical linear algebra. In particular, we present a simple algorithm for positivity-preserving Chebyshev interpolation, a surprising proof technique for bounding spectral norm moments of Gaussian random vectors, and a connection of the error of conjugate gradient iterates to the error of a certain quadratic form approximation. For each problem, we briefly explain its context and discuss its key properties.
Matrix Chaos Inequalities
Presenter: Petar Nizic-Nikolac, ETH Zurich (Monday, June 15, 3:45-5:00 PM)
Matrix concentration inequalities and their recently discovered sharp counterparts provide powerful tools to bound the spectrum of random matrices whose entries are linear functions of independent random variables. However, in many applications in theoretical computer science and in other areas one encounters more general random matrix models, called matrix chaoses, whose entries are polynomials of independent random variables. This poster outlines how to develop general matrix concentration inequalities for matrix chaoses, allowing these models to be treated in a systematic and unified way.
Joint work with Afonso S. Bandeira, Kevin Lucca, and Ramon van Handel.
Random sketching of operators with application to learning preconditioners
Presenter: Alexandre Pasco, EPFL (Monday, June 15, 3:45-5:00 PM)
We propose a new random sketching approach for embedding high-dimensional linear operators in Hilbert-Schmidt norm, using random input-output pairs. Such operator can then be approximated in a low-dimensional subspace of operators by solving a small least-squares problem. To achieve computational efficiency, we introduce a structured random map, composed of three random matrices. We provide rigorous conditions under which subspaces of operators are accurately embedded with high probability, requiring less input-output pairs than embedding based on Khatri-Rao products [1]. The framework is flexible, as the random matrices may be adapted to the operator structure and the computational environment. As an application, we consider the construction of preconditioners for high-dimensional linear equations, generalizing and improving the approach from [2]. We derive a rigorous characterization of preconditioner quality through the discrepancy between the preconditioned operator and an optimal baseline, which can be tailored to a linear approximation space for the solution. We show that this quantity can be efficiently minimized within the proposed framework, especially for parameter separable linear equations. We then establish rigorous high-probability bounds on the quasi-optimality error of the preconditioned Galerkin projection and on the accuracy of a preconditioned residual-based error estimator when the sketch dimensions are sufficiently large. Numerical experiments on an acoustic wave scattering benchmark demonstrate the effectiveness of the method.
Investigating Efficient Leverage Score Sampling Methods for the Alternating Least Squares Optimization of the Canonical Polyadic Decomposition
Presenter: Karl Pierce, University of Maryland (Monday, June 15, 3:45-5:00 PM)
The CANDECOMP/PARAFAC (CP) decomposition is a powerful tool used to break the “curse of dimensionality” associated with higher-order tensors. The decomposition is typically computed via a standard alternating least squares (CP-ALS) algorithm where one must iteratively solve a set of overdetermined least squares problems. Unfortunately, the CPD-ALS itself suffers from the curse of dimensionality and quickly become expensive in time and memory. Because of the overdetermined nature of the CPD-ALS, the decomposition is well suited for randomized linear algebra-based acceleration.
Though a number of randomized CPD-ALS strategies exist [1-4], we are particularly interested in leverage score sampling-based methods because of their operational efficiency and structure preserving-properties.
Although sampling is asymptotically inexpensive, the operation is memory-bound by the bandwidth and latency of a computer system.
This fact makes it necessary to discover efficient sampling strategies to minimize memory movement and communication and to maximize optimization efficiency of the CPD-ALS.
In this work, we discuss a number of strategies to improve the efficiency of leverage score sampling for the CPD-ALS including block-wise sampling, introduction of the interpolative decomposition, and the low-rank approximation of leverage scores.
We compare these strategies using a number of synthetic and real tensors which come from physics and data science.
[1] C. Battaglino, G. Ballard, and T. G. Kolda, A practical randomized CP tensor decomposition, SIAM Journal on Matrix Analysis and Applications, 39 (2018), pp. 876–901, doi:10.1137/17M1112303, arXiv:1701.06600.\newline
[2] B. W. Larsen and T. G. Kolda, Practical leverage-based sampling for low-rank tensor decomposition, SIAM Journal on Matrix Analysis and Applications 43, 1488 (2022).\newline
[3] G. Zhou, A. Cichocki, and S. Xie, Decomposition of big tensors with low multilinear rank, Dec. 2014, arXiv:1412.1885.\newline
[4] Y. Wang, H.-Y. Tung, A. J. Smola, and A. Anandkumar, Fast and guaranteed tensor decomposition via sketching, in Advances in Neural Information Processing Systems, 2015, pp. 991–999, http://papers.nips.cc/paper/5944-fast-and-guaranteed-tensor-decomposition-via-sketching.pdf.
On the Local Convergence of a Nonlinear Jacobi Iteration in the Tensor-Train Format
Presenter: Guifré Sánchez Serra, EPFL (Monday, June 15, 3:45-5:00 PM)
In recent years, the tensor train (TT) format has become a powerful tool for solving large-scale linear systems and eigenvalue problems, particularly in quantum physics, where states and operators are naturally represented as tensors. A common strategy is to formulate these problems as optimization tasks over the TT cores. Sequential optimization of the cores—known as alternating optimization—forms the basis of algorithms such as TT-ALS and DMRG. While the local convergence of these methods was studied in the 2010s by A. Uschmajew, T. Rohwedder, and others, existing results are largely qualitative. In this work, we present new convergence results for a relaxed Jacobi variant of the mentioned alternating scheme. More specifically, we prove local linear convergence under reasonable assumptions and derive a quantitative estimate of the convergence rate. We complement the theoretical analysis with numerical experiments that support our findings.
Incremental Column Subset Selection via Conditional Determinantal Point Processes
Presenter: Zhipeng Xue, EPFL (Monday, June 15, 3:45-5:00 PM)
We propose a perspective based on determinantal point processes(DPP) to consider many existing algorithms for column subset selection problem(CSSP). We show that adaptive randomized pivoting(ARP), the recent algorithm proposed by Cortinovis and Kressner for CSSP, can be viewed as implicit fixed-size DPP(or say, volume sampling), applied to an approximation of the input matrix of the CSSP. We formalize the joint probability of randomly pivoted Cholesky and show that it is a trace-penalized DPP. In addition, we propose using the conditional DPP as a remedy of many algorithms for CSSP, keeping previous selection while making the selection more diverse. In particular, we combine conditional DPP and ARP and get a new algorithm, called multi-stage ARP. We prove that multi-stage ARP inherits the approximation guarantees of ARP. The resulting analysis reveals a connection between global DPP-based sampling strategies and some greedy algorithms.