Beyond First Order Methods in ML Systems
Friday, July 17th - Timezone: PDT
Description: Over the last few decades, much effort has been devoted to the development of first-order methods. These methods enjoy a low per-iteration cost, achieve optimal complexity, are easy to implement, and have proven effective for most machine learning applications.
In contrast, higher-order methods, such as Newton, quasi-Newton, and adaptive gradient descent methods, are extensively used in many scientific and engineering domains. At least in theory, these methods possess several attractive features: they exploit local curvature information to mitigate the effects of ill-conditioning, they avoid or diminish the need for hyper-parameter tuning, and they offer enough concurrency to take advantage of distributed computing environments. However, higher-order methods are often “undervalued.”
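To make the curvature argument concrete, the short sketch below (illustrative only, not part of the workshop program) compares one gradient-descent step with one Newton step on a deliberately ill-conditioned two-dimensional quadratic; the matrix and step size are arbitrary choices made for this illustration.

```python
# Illustrative sketch: on an ill-conditioned quadratic f(x) = 0.5 * x^T A x,
# a Newton step uses curvature (the Hessian A) and reaches the minimizer in
# one step, while plain gradient descent crawls along the flat direction.
import numpy as np

A = np.diag([1.0, 100.0])        # Hessian with condition number 100 (arbitrary example)
x0 = np.array([1.0, 1.0])        # arbitrary starting point

def grad(x):
    return A @ x                 # gradient of the quadratic

# One gradient-descent step with step size 1/L, where L is the largest eigenvalue of A
x_gd = x0 - (1.0 / 100.0) * grad(x0)

# One Newton step: solve A d = grad(x0), then move by -d
x_newton = x0 - np.linalg.solve(A, grad(x0))

print("gradient step:", x_gd)      # [0.99, 0.0] -- barely moves along the flat direction
print("Newton step:  ", x_newton)  # [0.0, 0.0]  -- exact minimizer in a single step
```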
This workshop will attempt to shed light on this statement. Topics of interest include, but are not limited to, second-order methods, adaptive gradient descent methods, regularization techniques, and techniques based on higher-order derivatives. The workshop will bring machine learning and optimization researchers closer together, in order to facilitate a discussion of underlying questions such as the following:
Why are higher-order methods not omnipresent?
Why are higher-order methods important in machine learning?
What advantages can they offer? What are their limitations and disadvantages?
How should (or could) they be implemented in practice?
Plenary Speakers
Industry Panel
Organizers