Modern Trends in Nonconvex Optimization for Machine Learning

Accepted Papers

  1. Algorithmic Regularization in Learning Deep Homogeneous Models: Layers are Automatically Balanced. Simon S. Du, Wei Hu, and Jason D. Lee. [paper] [sup]
  2. Incremental Consensus based Collaborative Deep Learning. Zhanhong Jiang, Aditya Balu, Chinmay Hegde, Soumik Sarkar. [paper] [sup]
  3. Escaping Undesired Stationary Points in Local Saddle Point Optimization: A Curvature Exploitation Approach. Leonard Adolphs, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann. [paper] [sup]
  4. Using Mode Connectivity for Loss Landscape Analysis. Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher. [paper] [sup]
  5. Distributionally Robust Submodular Maximization. Matthew Staib, Bryan Wilder, and Stefanie Jegelka. [paper] [sup]
  6. Efficient algorithms for robust submodular maximization under matroid constraints. Sebastian Pokutta, Mohit Singh, and Alfredo Torrico. [paper]
  7. Block Mean Approximation for Efficient Second Order Optimization. Yao Lu, Mehrtash Harandi, Richard Hartley, Razvan Pascanu. [paper] [sup]
  8. Online Generalized Eigenvector Estimation. Junchi Li, Qiang Sun, Tong Zhang. [paper] [sup]
  9. The Case for Full-Matrix Adaptive Regularization. Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Karan Singh, Cyril Zhang, and Yi Zhang. [paper] [sup]
  10. DNN's Sharpest Directions Along the SGD Trajectory. Stanisław Jastrzębski, Zac Kenton, Nicolas Ballas, Asja Fischer, Yoshua Bengio, and Amos Storkey. [paper] [sup]
  11. Inferring the Group Lasso Structure via Bilevel Optimization. Jordan Frecon, Saverio Salzo, and Massimiliano Pontil. [paper] [sup]
  12. Robust Learning of Trimmed Estimators via Manifold Sampling. Matt Menickelly, Stefan Wild. [paper] [sup]
  13. Regularizing tensor decomposition methods by optimizing pseudo-data. Omer Gottesman, Finale Doshi-Velez. [paper] [sup]
  14. Learning Kolmogorov Models for Binary Random Variables. Hadi Ghauch, Mikael Skoglund, Hossein Shokri-Ghadikolaei, Carlo Fischione, and Ali Sayed. [paper] [sup]
  15. Efficient Dictionary Learning with Gradient Descent. Dar Gilboa, Sam Buchanan, John Wright. [paper] [sup]
  16. Escaping saddle points efficiently in equality-constrained optimization problems. Yue Sun, Maryam Fazel. [paper] [sup]
  17. Fast Algorithms for Sparse Reduced-Rank Regression. Benjamin Dubois, Jean-François Delmas, Guillaume Obozinski. [paper] [sup]
  18. Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval. Yuxin Chen, Yuejie Chi, Jianqing Fan, and Cong Ma. [paper] [sup]
  19. Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning. Dong Yin, Yudong Chen, Kannan Ramchandran, and Peter Bartlett. [paper]
  20. Convergence guarantees for RMSProp and ADAM in non-convex optimization and their comparison to Nesterov acceleration on autoencoders. Amitabh Basu, Soham De, Anirbit Mukherjee, and Enayat Ullah. [paper] [sup]
  21. Lifted Recurrent Neural Networks. Rajiv Sambharya, Armin Askari, Geoffrey Negiar, and Laurent El Ghaoui. [paper] [sup]
  22. Composite Non-Convex Optimization Using Gradient Method With Inexact Oracle. Pavel Dvurechensky. [paper]
  23. Relaxed Cyclic Douglas-Rachford Algorithms for Nonconvex Optimization. D. Russell Luke, Anna-Lena Martins and Matthew K. Tam. [paper]
  24. Scalable Natural Gradient Langevin Dynamics in Practice. Henri Palacci, Henry Hess. [paper]
  25. A first-order augmented Lagrangian framework for nonconvex optimization with nonlinear constraints. Bang Cong Vu, Ahmet Alacaoglu, Mehmet Fatih Sahin, Alp Yurtsever, Volkan Cevher. [paper] [sup]
  26. Robust Principal Component Analysis using Facial Reduction. Shiqian Ma, Fei Wang, Linchuan Wei, Henry Wolkowicz. [paper]
  27. On the Convergence of Block-Coordinate Maximization for Burer-Monteiro Method. Murat Erdogdu, Asuman Ozdaglar, Pablo Parrilo, and Nuri Vanli. [paper]
  28. Negative Momentum for Improved Game Dynamics. Gauthier Gidel, Reyhane AskariHemmat, Mohammad Pezeshki, Gabriel Huang, Rémi Lepriol, Simon Lacoste-Julien, and Ioannis Mitliagkas. [paper] [sup]
  29. The Goldilocks zone: Empirical exploration of the structure of the neural network loss landscapes. Stanislav Fort. [paper]
  30. Uniform Convergence of Gradients for Non-Convex Learning and Optimization. Dylan J. Foster, Ayush Sekhari, Karthik Sridharan. [paper] [sup]
  31. On the Analysis of Trajectories of Gradient Descent in the Optimization of Deep Neural Networks. Adepu Ravi Sankar, Vishwak Srinivasan, and Vineeth N Balasubramanian. [paper] [sup]
  32. Improved Generalization with Curvature Regularization. Xinyang Geng, Lechao Xiao, Hossein Mobahi, Jeffrey Pennington. [paper]
  33. Provably Fast Convergence of Batch Normalization on Learning Halfspaces under Gaussian Inputs. Jonas Kohler, Hadi Daneshmand, Aurelien Lucchi, Ming Zhou, Klaus Neymeyr, Thomas Hofmann. [paper] [sup]
  34. Adding One Neuron Can Eliminate All Bad Local Minima. Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant. [paper] [sup]