June 21st, 09:30 – 11:30am, Beijing International Convention Center.
See the ICML website for general information about the conference.

Abstract: We provide a unified overview of several families of algorithms proposed in different settings: Frank-Wolfe (a.k.a. Conditional Gradient) algorithms, greedy optimization methods, and related extensions. Frank-Wolfe methods have been successfully applied to a wide range of large-scale learning and signal processing applications, such as matrix factorization, multi-task learning, image denoising, and structured prediction. On the other hand, greedy optimization algorithms, which underlie several versions of boosting, appear in structured variable selection, metric learning, and training of sum-product networks. 

All these algorithms have in common that they rely on the atomic decomposition of the variable of interest, that is, expanding it as a linear combination of the elements of a dictionary. In this tutorial, we showcase these algorithms in a unified framework, present simple proofs of their convergence rates, and illustrate the underlying assumptions.
We show how these families of algorithms relate to each other, illustrate several successful applications, and highlight current challenges.
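
To make the idea of an atomic decomposition concrete, here is a minimal sketch of a Frank-Wolfe iteration. It is not taken from the tutorial material: the choice of the probability simplex as the feasible set (so that the atoms are the standard basis vectors), the quadratic objective, the step size 2/(t+2), and the name `frank_wolfe_simplex` are all assumptions made for illustration.

```python
def frank_wolfe_simplex(grad, dim, n_iters=2000):
    """Frank-Wolfe sketch over the probability simplex (illustrative only).

    The atoms are the standard basis vectors e_0, ..., e_{dim-1}.
    Returns the iterate x and its atomic decomposition as a dict
    mapping atom index -> coefficient.
    """
    # Start from the atom e_0.
    x = [0.0] * dim
    x[0] = 1.0
    weights = {0: 1.0}

    for t in range(n_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the basis vector with the smallest gradient entry.
        i = min(range(dim), key=lambda j: g[j])
        # Standard open-loop step size (an assumption, not a line search).
        gamma = 2.0 / (t + 2.0)
        # Convex combination of the current iterate and the chosen atom.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
        # Keep the atomic decomposition in sync with the iterate.
        weights = {k: (1.0 - gamma) * w for k, w in weights.items()}
        weights[i] = weights.get(i, 0.0) + gamma

    return x, weights


# Hypothetical usage: minimize 0.5 * ||x - b||^2 for a target b in the simplex.
b = [0.2, 0.5, 0.3]
x, weights = frank_wolfe_simplex(lambda v: [vj - bj for vj, bj in zip(v, b)], dim=3)
```

Each iteration adds at most one atom to the expansion, so the iterate is always an explicit convex combination of dictionary elements. Greedy methods build a similar expansion one atom at a time.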