Research

On this page I describe my recent research on optimization and sampling methods.

Joint work with Chris J. Maddison, Yee Whye Teh, Brendan O'Donoghue, and Arnaud Doucet from Oxford and DeepMind.

In this paper, we generalise the momentum method and extend the class of convex functions that can be optimised at linear rates using only gradients. The key feature of the method is its use of generalised kinetic energies, which offer great flexibility and can drastically improve convergence speed for functions that are not strongly convex or whose gradients are not Lipschitz continuous. The figure below illustrates this speedup for a non-strongly convex function (the plots are on a logarithmic scale).
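To fix notation, the underlying object is a Hamiltonian coupling the objective f with a kinetic energy k. The classical momentum method corresponds to the quadratic choice below; the power form is one illustrative example of a generalised kinetic energy (my notation, not necessarily the exact family considered in the paper):

```latex
H(x, p) = f(x) + k(p), \qquad
k(p) = \tfrac{1}{2}\|p\|^2 \ \text{(classical)}, \qquad
k_a(p) = \tfrac{1}{a}\|p\|^a, \quad a > 1.
```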

In recent years, papers by Su, Boyd, and Candès and by Wibisono, Wilson, and Jordan have studied the continuous-time limits of optimization algorithms. As the step size tends to zero, these algorithms converge to ODE limits, which are often much easier to analyse than their discrete counterparts. Once the behaviour in continuous time is well understood, one can study various discretizations of these ODEs, some of which lead to new optimization methods.
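For example, Su, Boyd, and Candès showed that the continuous-time limit of Nesterov's accelerated gradient method is

```latex
\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0,
```

a second-order ODE whose convergence can be established with a short Lyapunov-function argument.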

In our work, we have followed this approach by first studying a class of ODEs called conformal Hamiltonian systems. These systems consist of standard Hamiltonian dynamics with a general kinetic energy, plus a dissipative friction term. Due to this friction term, the total energy of the system decreases over time, which makes these systems useful for optimization.
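Written out (this is the standard form of such systems, with k the kinetic energy and γ > 0 the friction coefficient), the dynamics are

```latex
\dot{x} = \nabla k(p), \qquad \dot{p} = -\nabla f(x) - \gamma p,
```

and along trajectories the total energy H(x, p) = f(x) + k(p) satisfies dH/dt = -γ⟨p, ∇k(p)⟩ ≤ 0 whenever k is convex and minimised at p = 0, which is exactly the dissipation property that makes the system useful for optimization.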

Using ideas from convex analysis, we develop a complete understanding of these systems in continuous time, and show that their linear convergence is preserved under suitably chosen discretizations. This leads to a new class of optimization methods with linear convergence on a much larger class of convex functions than existing first-order methods.
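A minimal sketch of this kind of discretization is below, assuming a conformal variant of symplectic Euler (the exact scheme and parameter choices in the paper may differ). The kinetic-energy gradient grad_k is a free choice; grad_k(p) = p recovers the classical momentum method, and the power kinetic energy is my illustrative choice matched to the quartic example.

```python
import numpy as np

def conformal_euler(grad_f, grad_k, x0, step=0.01, friction=1.0, n_iters=5000):
    """Illustrative conformal symplectic-Euler discretization of
    dx/dt = grad_k(p),  dp/dt = -grad_f(x) - friction * p."""
    x = np.asarray(x0, dtype=float).copy()
    p = np.zeros_like(x)
    for _ in range(n_iters):
        # Damp the momentum and take a gradient step in p.
        p = (1.0 - step * friction) * p - step * grad_f(x)
        # Move the position along the kinetic-energy gradient.
        x = x + step * grad_k(p)
    return x

# f(x) = x^4 is convex but not strongly convex around its minimiser.
grad_f = lambda x: 4.0 * x ** 3

# Classical kinetic energy k(p) = |p|^2 / 2, i.e. grad_k(p) = p.
x_classical = conformal_euler(grad_f, lambda p: p, x0=[2.0])

# Power kinetic energy k(p) = (3/4)|p|^{4/3} (illustrative, Hoelder-dual to the
# quartic growth of f), whose gradient is sign(p) * |p|^{1/3}.
x_power = conformal_euler(grad_f,
                          lambda p: np.sign(p) * np.abs(p) ** (1.0 / 3.0),
                          x0=[2.0])

print(x_classical, x_power)
```

Near the flat minimum of f the classical iterates slow down polynomially, while the power kinetic energy keeps the effective step size from collapsing, which is the kind of speedup the figure above illustrates.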

Joint work with George Deligiannidis, Alexandre Bouchard-Côté, and Arnaud Doucet from Oxford and UBC.

In this paper, we analyse the high-dimensional behaviour of a recently proposed non-reversible Markov chain Monte Carlo method called the Bouncy Particle Sampler (BPS). The process moves along straight lines, with bounces occurring at a position-dependent rate chosen to guarantee convergence to the target distribution. We have found that, although BPS moves along straight lines, in high dimensions its behaviour tends to that of another process, Randomized Hamiltonian Monte Carlo (RHMC), which follows Hamiltonian dynamics (and hence moves along curved paths). We analyse the mixing properties of this limiting process and show that it has a dimension-free convergence rate for strongly log-concave target distributions. This understanding allows us to propose appropriate tuning parameters for the Bouncy Particle Sampler that improve its efficiency.
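The toy sketch below (my illustrative discretised implementation for a standard Gaussian target, not the exact algorithm or tuning from the paper) shows the two ingredients of BPS: straight-line motion, and bounces that reflect the velocity off the gradient of the negative log-density, together with occasional velocity refreshments that ensure ergodicity.

```python
import numpy as np

def bps_gaussian(dim=50, T=100.0, dt=1e-3, refresh_rate=1.0, seed=None):
    """Toy discretised Bouncy Particle Sampler for a standard Gaussian target,
    i.e. U(x) = |x|^2 / 2 so that grad U(x) = x. Event times are approximated
    by accepting an event in each small step with probability rate * dt."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)              # unit-speed velocity
    path = [x.copy()]
    for _ in range(int(T / dt)):
        x = x + dt * v                  # straight-line motion between events
        grad = x                        # grad U(x) for the Gaussian target
        bounce_rate = max(0.0, v @ grad)
        if rng.random() < bounce_rate * dt:
            # Reflect v off the hyperplane orthogonal to grad U (norm-preserving).
            v = v - 2.0 * (v @ grad) / (grad @ grad) * grad
        elif rng.random() < refresh_rate * dt:
            # Occasional full refreshment of the velocity.
            v = rng.standard_normal(dim)
            v /= np.linalg.norm(v)
        path.append(x.copy())
    return np.array(path)

samples = bps_gaussian(seed=0)
print(samples.mean(axis=0)[:5])        # coordinate means should be near zero
```

As the dimension grows, each individual bounce changes the velocity only slightly, so the piecewise-linear paths increasingly resemble the curved Hamiltonian trajectories of RHMC.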

The figures below show the convergence of BPS towards RHMC by comparing the paths of BPS with the contours of the Hamiltonian energy (which is preserved by the deterministic dynamics of RHMC and changes only at the jumps in velocity).