Modern Trends in Nonconvex Optimization for Machine Learning
Accepted Papers
[Best paper award] Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced. Simon S. Du, Wei Hu, and Jason D. Lee. [paper] [sup]
Incremental Consensus based Collaborative Deep Learning. Zhanhong Jiang, Aditya Balu, Chinmay Hegde, Soumik Sarkar. [paper] [sup]
Escaping Undesired Stationary Points in Local Saddle Point Optimization: A Curvature Exploitation Approach. Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann. [paper] [sup]
Using Mode Connectivity for Loss Landscape Analysis. Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher. [paper] [sup]
Distributionally Robust Submodular Maximization. Matthew Staib, Bryan Wilder, and Stefanie Jegelka. [paper] [sup]
Efficient algorithms for robust submodular maximization under matroid constraints. Sebastian Pokutta, Mohit Singh, and Alfredo Torrico. [paper]
Block Mean Approximation for Efficient Second Order Optimization. Yao Lu, Mehrtash Harandi, Richard Hartley, Razvan Pascanu. [paper] [sup]
Online Generalized Eigenvector Estimation. Junchi Li, Qiang Sun, Tong Zhang. [paper] [sup]
The Case for Full-Matrix Adaptive Regularization. Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, and Yi Zhang. [paper] [sup]
DNN's Sharpest Directions Along the SGD Trajectory. Stanisław Jastrzębski, Zac Kenton, Nicolas Ballas, Asja Fischer, Yoshua Bengio, and Amos Storkey. [paper] [sup]
Inferring the Group Lasso Structure via Bilevel Optimization. Jordan Frecon, Saverio Salzo, and Massimiliano Pontil. [paper] [sup]
Robust Learning of Trimmed Estimators via Manifold Sampling. Matt Menickelly, Stefan Wild. [paper] [sup]
Regularizing tensor decomposition methods by optimizing pseudo-data. Omer Gottesman, Finale Doshi-Velez. [paper] [sup]
Learning Kolmogorov Models for Binary Random Variables. Hadi Ghauch, Mikael Skoglund, Hossein Shokri-Ghadikolaei, Carlo Fischione, and Ali Sayed. [paper] [sup]
Efficient Dictionary Learning with Gradient Descent. Dar Gilboa, Sam Buchanan, John Wright. [paper] [sup]
Escaping saddle points efficiently in equality-constrained optimization problems. Yue Sun, Maryam Fazel. [paper] [sup]
Fast Algorithms for Sparse Reduced-Rank Regression. Benjamin Dubois, Jean-François Delmas, Guillaume Obozinski. [paper] [sup]
Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval. Yuxin Chen, Yuejie Chi, Jianqing Fan, and Cong Ma. [paper] [sup]
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning. Dong Yin, Yudong Chen, Kannan Ramchandran, and Peter Bartlett. [paper]
Convergence guarantees for RMSProp and ADAM in non-convex optimization and their comparison to Nesterov acceleration on autoencoders. Amitabh Basu, Soham De, Anirbit Mukherjee, and Enayat Ullah. [paper] [sup]
Lifted Recurrent Neural Networks. Rajiv Sambharya, Armin Askari, Geoffrey Negiar, and Laurent El Ghaoui. [paper] [sup]
Composite Non-Convex Optimization Using Gradient Method With Inexact Oracle. Pavel Dvurechensky. [paper]
Relaxed Cyclic Douglas-Rachford Algorithms for Nonconvex Optimization. D. Russell Luke, Anna-Lena Martins and Matthew K. Tam. [paper]
Scalable Natural Gradient Langevin Dynamics in Practice. Henri Palacci, Henry Hess. [paper]
A first-order augmented Lagrangian framework for nonconvex optimization with nonlinear constraints. Bang Cong Vu, Ahmet Alacaoglu, Mehmet Fatih Sahin, Alp Yurtsever, Volkan Cevher. [paper] [sup]
Robust Principal Component Analysis using Facial Reduction. Shiqian Ma, Fei Wang, Linchuan Wei, Henry Wolkowicz. [paper]
On the Convergence of Block-Coordinate Maximization for Burer-Monteiro Method. Murat Erdogdu, Asuman Ozdaglar, Pablo Parrilo, and Nuri Vanli. [paper]
Negative Momentum for Improved Game Dynamics. Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Gabriel Huang, Rémi Lepriol, Simon Lacoste-Julien, and Ioannis Mitliagkas. [paper] [sup]
The Goldilocks zone: Empirical exploration of the structure of the neural network loss landscapes. Stanislav Fort. [paper]
Uniform Convergence of Gradients for Non-Convex Learning and Optimization. Dylan J. Foster, Ayush Sekhari, Karthik Sridharan. [paper] [sup]
On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks. Adepu Ravi Sankar, Vishwak Srinivasan, and Vineeth N Balasubramanian. [paper] [sup]
Improved Generalization with Curvature Regularization. Xinyang Geng, Lechao Xiao, Hossein Mobahi, Jeffrey Pennington. [paper]
Provably Fast Convergence of Batch Normalization on Learning Halfspaces under Gaussian Inputs. Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Ming Zhou, Klaus Neymeyr, Thomas Hofmann. [paper] [sup]
Adding One Neuron Can Eliminate All Bad Local Minima. Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant. [paper] [sup]