The goal of this project is to reduce the huge cost of developing deep learning models and thereby expand the range of domains in which these models work well. To this end, I aim to unravel the mystery of the implicit regularization ability of successful learning methods. Moreover, I plan to develop new, stable learning methods for deep learning and network architecture search that explicitly exploit this regularization ability.
[Oko et al. (ICLR2022)] Kazusato Oko, Taiji Suzuki, Atsushi Nitanda, and Denny Wu. Particle Stochastic Dual Coordinate Ascent: Exponential convergent algorithm for mean field neural network optimization. The 10th International Conference on Learning Representations (ICLR2022), 2022.
[Nitanda et al. (AISTATS2022)] Atsushi Nitanda, Denny Wu, and Taiji Suzuki. Convex Analysis of the Mean Field Langevin Dynamics. The 25th International Conference on Artificial Intelligence and Statistics (AISTATS2022), Proceedings of Machine Learning Research, ***:****--****, 2022.
[Suzuki et al. (NeurIPS2021)] Atsushi Suzuki, Atsushi Nitanda, Jing Wang, Linchuan Xu, Kenji Yamanishi, and Marc Cavazza. Generalization Bounds for Graph Embedding Using Negative Sampling: Linear vs Hyperbolic. In Advances in Neural Information Processing Systems 34 (NeurIPS2021), pp.1243--1255, 2021.
[Nitanda et al. (NeurIPS2021)] Atsushi Nitanda, Denny Wu, and Taiji Suzuki. Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis. In Advances in Neural Information Processing Systems 34 (NeurIPS2021), pp.19608--19621, 2021. [arXiv]
[Suzuki and Nitanda (NeurIPS2021)] Taiji Suzuki and Atsushi Nitanda. Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space. In Advances in Neural Information Processing Systems 34 (NeurIPS2021), pp.3609--3621, 2021. (Spotlight) [arXiv]
[Nitanda et al. (KAIS2021)] Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki. Sharp Characterization of Optimal Minibatch Size for Stochastic Finite Sum Convex Optimization. Knowledge and Information Systems (KAIS), 63(9):2513--2539, 2021. (Journal version of ICDM2019 paper)
[Nitanda and Suzuki (ICLR2021)] Atsushi Nitanda and Taiji Suzuki. Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime. The 9th International Conference on Learning Representations (ICLR2021), 2021. (Outstanding paper award) [arXiv], [openreview]
[Amari et al. (ICLR2021)] Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, and Ji Xu. When Does Preconditioning Help or Hurt Generalization?. The 9th International Conference on Learning Representations (ICLR2021), 2021. [arXiv], [openreview]
[Yashima et al. (AISTATS2021)] Shingo Yashima, Atsushi Nitanda, and Taiji Suzuki. Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features. The 24th International Conference on Artificial Intelligence and Statistics (AISTATS2021), Proceedings of Machine Learning Research, 130:1954--1962, 2021. [arXiv]
[Nitanda and Suzuki (AISTATS2020)] Atsushi Nitanda and Taiji Suzuki. Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees. The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS2020), Proceedings of Machine Learning Research, 108:2981--2991, 2020.
[Nitanda et al. (ICDM2019)] Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki. Sharp Characterization of Optimal Minibatch Size for Stochastic Finite Sum Convex Optimization. 2019 IEEE International Conference on Data Mining (ICDM2019), pp.488--497, 2019. (Regular, Best Paper candidate for KAIS publication)
Atsushi Nitanda. Convergence Analysis of Stochastic Optimization Methods. Proceedings of the 32nd RAMP Symposium, 2020. (in Japanese)
Atsushi Nitanda. Optimization Theory for Neural Networks. Operations Research (Operations Research Society of Japan), 65(12):643--649, 2020. (in Japanese)
Atsushi Nitanda. Stochastic Gradient Descent and Neural Networks. Mathematical Sciences (Suri Kagaku), special issue "An Invitation to Statistical Thinking", 58(9):22--28, Saiensu-sha, September 2020. (in Japanese)
[Nitanda and Suzuki (AISTATS2019)] Atsushi Nitanda and Taiji Suzuki. Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors. The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS2019), Proceedings of Machine Learning Research, 89:1417--1426, 2019. (oral presentation) [arXiv] [slide]
[Nitanda and Suzuki (ICML2018)] Atsushi Nitanda and Taiji Suzuki. Functional Gradient Boosting based on Residual Network Perception. The 35th International Conference on Machine Learning (ICML2018), Proceedings of Machine Learning Research, 80:3819--3828, 2018. [arXiv] [code] [slide]
[Nitanda and Suzuki (AISTATS2018)] Atsushi Nitanda and Taiji Suzuki. Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models. The 21st International Conference on Artificial Intelligence and Statistics (AISTATS2018), Proceedings of Machine Learning Research, 84:1008--1016, 2018. [arXiv]
[Nitanda and Suzuki (2017)] Atsushi Nitanda and Taiji Suzuki. Stochastic Particle Gradient Descent for Infinite Ensembles. 2017. [arXiv]