Publications 

Selected Publications

Abstract: Sampling conditional distributions is a fundamental task for Bayesian inference and density estimation. Generative models, such as normalizing flows and generative adversarial networks, characterize conditional distributions by learning a transformation that transports a simple reference (e.g., a standard Gaussian) to a target distribution. While these approaches successfully describe many non-Gaussian problems, their performance is often limited by parametric bias and by the reliability of the gradient-based (adversarial) optimizers used to learn these transformations. This work proposes a non-parametric generative model that iteratively maps samples between the reference and target distributions. Our formulation solves the optimal transport problem by minimizing a weighted cost function that yields block-triangular transport maps, thereby extending the approach in Trigila and Tabak [2016] to conditional sampling. The proposed approach is demonstrated on a two-dimensional example and on a parameter inference problem involving nonlinear ODEs.
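For intuition on the block-triangular structure: the map takes the form T(x, z) = (x, tau(x, z)), so once tau has been learned from joint samples, conditional samples at a fixed x* are produced by pushing reference noise z through tau(x*, .). The sketch below is only the simplest linear-Gaussian special case of such a map, fitted by ordinary least squares; it illustrates the sampling mechanism, not the paper's nonparametric iterative procedure, and all names are made up.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy joint samples with a linear-Gaussian conditional: Y | X = x ~ N(1.5 x, 0.25).
    x = rng.normal(size=2000)
    y = 1.5 * x + 0.5 * rng.normal(size=2000)

    # Simplest block-triangular map T(x, z) = (x, a*x + b + sigma*z), fitted by least squares.
    design = np.column_stack([x, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    sigma = np.std(y - (a * x + b))

    def sample_conditional(x_star, n=1000):
        # Push standard-normal reference noise through tau(x_star, .).
        z = rng.normal(size=n)
        return a * x_star + b + sigma * z

    print(sample_conditional(2.0).mean())   # close to 3.0 for this toy model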

Essid, Montacer, Esteban G. Tabak, and Giulio Trigila. An implicit gradient-descent procedure for minimax problems. Mathematical Methods of Operations Research 97.1 (2023): 57-89.

Abstract: A game theory inspired methodology is proposed for finding a function’s saddle points. While explicit descent methods are known to have severe convergence issues, implicit methods are natural in an adversarial setting, as they take the other player’s optimal strategy into account. The implicit scheme proposed has an adaptive learning rate that makes it transition to Newton’s method in the neighborhood of saddle points. Convergence is shown through local analysis and through numerical examples in optimal transport and linear programming. An ad-hoc quasi-Newton method is developed for high dimensional problems, for which the inversion of the Hessian of the objective function may entail a high computational cost.
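To see why an implicit step is natural for saddle problems, consider the textbook bilinear objective f(x, y) = x y: explicit simultaneous gradient descent-ascent spirals away from the saddle at the origin, while the backward-Euler (implicit) step contracts toward it. The sketch below checks exactly this special case with a fixed step size; it is background intuition only, not the adaptive, Newton-transitioning scheme of the paper.

    import numpy as np

    # f(x, y) = x * y. Descent in x, ascent in y gives the saddle field F(z) = (y, -x) = A z.
    A = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])
    h = 0.5
    z_explicit = np.array([1.0, 1.0])
    z_implicit = np.array([1.0, 1.0])

    for _ in range(50):
        # Explicit step (I - h A): each iterate grows in norm by sqrt(1 + h^2).
        z_explicit = z_explicit - h * A @ z_explicit
        # Implicit step solves (I + h A) z_new = z_old: norm shrinks by 1/sqrt(1 + h^2).
        z_implicit = np.linalg.solve(np.eye(2) + h * A, z_implicit)

    print(np.linalg.norm(z_explicit), np.linalg.norm(z_implicit))   # diverged vs. near the saddle (0, 0)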

Tabak, E., Trigila, G., & Zhao, W. (2022). "Distributional Barycenter Problem through data-driven flows." Pattern Recognition, 108795.

Abstract: A new method is proposed for the solution of the data-driven optimal transport barycenter problem and of the more general distributional barycenter problem that the article introduces. The method improves on previous approaches based on adversarial games by slaving the discriminator to the generator, which minimizes the need for parameterizations, and by allowing the adoption of general cost functions. It is applied to numerical examples, which include analyzing the MNIST data set with a new cost function that penalizes non-isometric maps.
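For context on the barycenter problem itself: in one dimension the Wasserstein-2 barycenter of equally weighted sample sets is obtained by averaging quantile functions, i.e., by averaging sorted samples. The sketch below computes this classical special case; it conveys what the barycenter is, not the article's adversarial, flow-based solver.

    import numpy as np

    rng = np.random.default_rng(0)

    # Two equally sized sample sets standing in for two distributions.
    s1 = np.sort(rng.normal(loc=-2.0, scale=1.0, size=1000))
    s2 = np.sort(rng.normal(loc=3.0, scale=0.5, size=1000))

    # Sorting aligns samples by quantile, so the 1D W2 barycenter (equal weights)
    # is the pointwise average of the sorted samples.
    barycenter = 0.5 * (s1 + s2)

    print(barycenter.mean(), barycenter.std())   # roughly 0.5 and 0.75 for these Gaussians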

Pavon, Michele, Giulio Trigila, and Esteban G. Tabak. The Data‐Driven Schrödinger Bridge. Communications on Pure and Applied Mathematics 74.7 (2021): 1545-1573.

Abstract: Erwin Schrödinger posed, and to a large extent solved in 1931/32, the problem of finding the most likely random evolution between two continuous probability distributions. This article considers this problem in the case when only samples of the two distributions are available. A novel iterative procedure is proposed, inspired by Fortet-IPF-Sinkhorn-type algorithms. Since only samples of the marginals are available, the new approach features constrained maximum likelihood estimation in place of the nonlinear boundary couplings. Compared to the introduction of grids, which in high dimensions leads to numerically infeasible methods, this approach mitigates the curse of dimensionality. The methodology is illustrated in two applications: entropic interpolation of two-dimensional Gaussian mixtures, and the estimation of integrals through a variation of importance sampling.
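For background on the Sinkhorn-type fixed point behind the iteration, the sketch below runs the standard discrete Sinkhorn algorithm for entropic optimal transport between two sample sets, alternately rescaling the kernel until both marginals are matched. It works on an explicit cost matrix, so it is not the paper's sample-based, constrained maximum-likelihood procedure; parameter values are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)

    # Samples from the two marginals and the entropic kernel K = exp(-C / eps).
    x = rng.normal(size=(200, 1))
    y = rng.normal(loc=2.0, size=(200, 1))
    C = (x - y.T) ** 2
    eps = 0.5
    K = np.exp(-C / eps)

    # Uniform marginal weights; alternate the two scalings until both marginals match.
    mu = np.full(200, 1.0 / 200)
    nu = np.full(200, 1.0 / 200)
    u = np.ones(200)
    for _ in range(500):
        v = nu / (K.T @ u)
        u = mu / (K @ v)

    P = u[:, None] * K * v[None, :]   # entropic coupling: rows sum to mu, columns to nu
    print(P.sum(axis=1)[:3], P.sum(axis=0)[:3])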

Abstract: A family of normalizing flows is introduced for selectively removing from a data set the variability attributable to a specific set of cofactors, while preserving the dependence on others. This is achieved by extending the barycenter problem of optimal transport theory to the newly introduced conditional barycenter problem. Rather than summarizing the data with a single probability distribution, as in the classical barycenter problem, the conditional barycenter is represented by a family of distributions labeled by the cofactors kept. The use of the conditional barycenter and its differences with the classical barycenter are illustrated on synthetic and real data addressing treatment effect estimation, super-resolution, anomaly detection and lightness transfer in image analysis.
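A toy one-dimensional version of this removal-while-preserving idea can be written with quantile matching: for each value of the cofactor to be kept, map every group defined by the removed cofactor onto the barycenter of those groups. The sketch below does this for two binary cofactors; it is a hypothetical illustration using monotone rearrangement, not the normalizing-flow construction of the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # y depends on a kept cofactor c (shift +3) and a removed cofactor z (shift +1).
    n = 2000
    c = rng.integers(0, 2, n)
    z = rng.integers(0, 2, n)
    y = rng.normal(size=n) + 3.0 * c + 1.0 * z

    y_out = np.empty_like(y)
    for cv in (0, 1):
        # 1D barycenter over the z-groups, computed separately for each kept label cv.
        groups = [np.sort(y[(c == cv) & (z == zv)]) for zv in (0, 1)]
        grid = np.linspace(0.0, 1.0, min(len(g) for g in groups))
        bary = np.mean([np.quantile(g, grid) for g in groups], axis=0)
        for zv in (0, 1):
            idx = (c == cv) & (z == zv)
            ranks = (np.argsort(np.argsort(y[idx])) + 0.5) / idx.sum()   # within-group quantiles
            y_out[idx] = np.interp(ranks, grid, bary)                    # monotone map to the barycenter

    print(y_out[z == 1].mean() - y_out[z == 0].mean())   # near 0: z-variability removed
    print(y_out[c == 1].mean() - y_out[c == 0].mean())   # near 3: c-dependence preserved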

Under Review