Atsushi Nitanda
Principal Scientist and Investigator
A*STAR Centre for Frontier AI Research (A*STAR CFAR)
Adjunct Associate Professor
College of Computing and Data Science (CCDS), Nanyang Technological University (NTU)
Supported by JST PRESTO (Math-structure area, Oct. 2019 -- Mar. 2023)
Ph.D. in Information Science and Technology, The University of Tokyo. Supervisor: Taiji Suzuki
Research interests: Stochastic Optimization, Mean-field Optimization, Optimization-based Learning Theory, Machine Learning, Deep Learning
E-mail: atsushi_nitanda [at] cfar.a-star.edu.sg
Address: 1 Fusionopolis Way, #16-16, Connexis (North Tower), Singapore 138632
A*STAR scholarship and internship
A*STAR offers various scholarships and internship opportunities for students who want to study at A*STAR and universities in Singapore. For details, see the calls for applications for each scholarship.
For Singaporean / local university students:
ACIS: A program that encourages students to pursue a Ph.D. degree at a university in Singapore. A monthly stipend, tuition fees, and support for overseas attachment are provided.
ARIA: An internship program for Singaporean undergraduate students. Awardees receive a monthly allowance of $1,600 to $2,000.
For international students:
SINGA: A program that allows international students to earn a Ph.D. degree at a university in Singapore (NTU, NUS, SUTD, or SMU) while conducting research at the university or at A*STAR. Tuition and living expenses are subsidized.
ARAP: A program that supports Ph.D. students from outside Singapore in conducting research at A*STAR for 1--2 years. Living expenses are subsidized.
SIPGA: A program that supports 3rd- and 4th-year undergraduate and master's students from outside Singapore for 2--6 months of research activities at A*STAR, with a monthly grant of S$2,000.
Please contact me if you are interested in these opportunities under my supervision.
* denotes alphabetical author ordering below
Conference Papers (Refereed)
Atsushi Nitanda, Ryuhei Kikuchi, Shugo Maeda, and Denny Wu. Why is Parameter Averaging Beneficial in SGD? An Objective Smoothing Perspective. The 27th International Conference on Artificial Intelligence and Statistics (AISTATS2024), Proceedings of Machine Learning Research, 238:3565--3573, 2024. [arXiv]
Yuka Hashimoto, Sho Sonoda, Isao Ishikawa, Atsushi Nitanda, and Taiji Suzuki. Koopman-Based Bound for Generalization: New Aspect of Neural Networks Regarding Nonlinear Noise Filtering. The 12th International Conference on Learning Representations (ICLR2024), 2024. [arXiv] [openreview]
Atsushi Nitanda*, Kazusato Oko*, Taiji Suzuki*, and Denny Wu*. Improved Statistical and Computational Complexity of the Mean-field Langevin Dynamics under Structured Data. The 12th International Conference on Learning Representations (ICLR2024), 2024. [openreview]
Taiji Suzuki, Denny Wu, Kazusato Oko, and Atsushi Nitanda. Feature Learning via Mean-field Langevin Dynamics: Classifying Sparse Parities and Beyond. The 37th Annual Conference on Neural Information Processing Systems (NeurIPS2023), In Advances in Neural Information Processing Systems, 36:34536--34556, 2023.
Taiji Suzuki, Denny Wu, and Atsushi Nitanda. Convergence of Mean-field Langevin Dynamics: Time and Space Discretization, Stochastic Gradient, and Variance Reduction. The 37th Annual Conference on Neural Information Processing Systems (NeurIPS2023), In Advances in Neural Information Processing Systems, 36:15545--15577, 2023. (Spotlight) [arXiv]
Atsushi Suzuki, Atsushi Nitanda, Taiji Suzuki, Jing Wang, Feng Tian, and Kenji Yamanishi. Tight and Fast Generalization Error Bound of Graph Embedding in Metric Space. The 40th International Conference on Machine Learning (ICML2023), Proceedings of Machine Learning Research, 202:33268--33284, 2023.
Atsushi Nitanda, Kazusato Oko, Denny Wu, Nobuhito Takenouchi, and Taiji Suzuki. Primal and Dual Analysis of Entropic Fictitious Play for Finite-sum Problems. The 40th International Conference on Machine Learning (ICML2023), Proceedings of Machine Learning Research, 202:26266--26282, 2023. [arXiv]
Taiji Suzuki, Atsushi Nitanda, and Denny Wu. Uniform-in-time Propagation of Chaos for the Mean Field Gradient Langevin Dynamics. The 11th International Conference on Learning Representations (ICLR2023), 2023. [openreview]
Naoki Nishikawa, Taiji Suzuki, Atsushi Nitanda, and Denny Wu. Two-layer Neural Network on Infinite Dimensional Data: Global Optimization Guarantee in the Mean-field Regime. The 36th Annual Conference on Neural Information Processing Systems (NeurIPS2022), In Advances in Neural Information Processing Systems, 35:32612--32623, 2022.
Kazusato Oko, Taiji Suzuki, Atsushi Nitanda, and Denny Wu. Particle Stochastic Dual Coordinate Ascent: Exponential convergent algorithm for mean field neural network optimization. The 10th International Conference on Learning Representations (ICLR2022), 2022. [openreview]
Atsushi Nitanda, Denny Wu, and Taiji Suzuki. Convex Analysis of the Mean Field Langevin Dynamics. The 25th International Conference on Artificial Intelligence and Statistics (AISTATS2022), Proceedings of Machine Learning Research, 151:9741--9757, 2022. [arXiv]
Atsushi Suzuki, Atsushi Nitanda, Jing Wang, Linchuan Xu, Kenji Yamanishi, and Marc Cavazza. Generalization Bounds for Graph Embedding Using Negative Sampling: Linear vs Hyperbolic. The 35th Annual Conference on Neural Information Processing Systems (NeurIPS2021), In Advances in Neural Information Processing Systems, 34:1243--1255, 2021.
Atsushi Nitanda, Denny Wu, and Taiji Suzuki. Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis. The 35th Annual Conference on Neural Information Processing Systems (NeurIPS2021), In Advances in Neural Information Processing Systems, 34:19608--19621, 2021. [arXiv], [openreview]
Taiji Suzuki and Atsushi Nitanda. Deep Learning is Adaptive to Intrinsic Dimensionality of Model Smoothness in Anisotropic Besov Space. The 35th Annual Conference on Neural Information Processing Systems (NeurIPS2021), In Advances in Neural Information Processing Systems, 34:3609--3621, 2021. (Spotlight) [arXiv]
Atsushi Suzuki, Atsushi Nitanda, Jing Wang, Linchuan Xu, Kenji Yamanishi, and Marc Cavazza. Generalization Error Bound for Hyperbolic Ordinal Embedding. The 38th International Conference on Machine Learning (ICML2021), Proceedings of Machine Learning Research, 139:10011--10021, 2021.
Atsushi Nitanda and Taiji Suzuki. Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime. The 9th International Conference on Learning Representations (ICLR2021), 2021. (Outstanding Paper Award) [arXiv], [openreview] (8 papers out of 860 accepted papers, 2997 submissions)
Shun-ichi Amari*, Jimmy Ba*, Roger Grosse*, Xuechen Li*, Atsushi Nitanda*, Taiji Suzuki*, Denny Wu*, and Ji Xu*. When Does Preconditioning Help or Hurt Generalization?. The 9th International Conference on Learning Representations (ICLR2021), 2021. [arXiv], [openreview]
Shingo Yashima, Atsushi Nitanda, and Taiji Suzuki. Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features. The 24th International Conference on Artificial Intelligence and Statistics (AISTATS2021), Proceedings of Machine Learning Research, 130:1954--1962, 2021. [arXiv]
Atsushi Nitanda and Taiji Suzuki. Functional Gradient Boosting for Learning Residual-like Networks with Statistical Guarantees. The 23rd International Conference on Artificial Intelligence and Statistics (AISTATS2020), Proceedings of Machine Learning Research, 108:2981--2991, 2020.
Atsushi Suzuki, Jing Wang, Feng Tian, Atsushi Nitanda, and Kenji Yamanishi. Hyperbolic Ordinal Embedding. The 11th Asian Conference on Machine Learning (ACML2019), Proceedings of Machine Learning Research, 101:1065--1080, 2019.
Satoshi Hara, Atsushi Nitanda, and Takanori Maehara. Data Cleansing for Models Trained with SGD. The 33rd Annual Conference on Neural Information Processing Systems (NeurIPS2019), In Advances in Neural Information Processing Systems, 32:4213--4222, 2019. [arXiv].
Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki. Sharp Characterization of Optimal Minibatch Size for Stochastic Finite Sum Convex Optimization. 2019 IEEE International Conference on Data Mining (ICDM2019), pp. 488--497, 2019. (Regular, Best Paper candidate for KAIS publication) [slide]
Atsushi Nitanda and Taiji Suzuki. Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors. The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS2019), Proceedings of Machine Learning Research, 89:1417--1426, 2019. (Oral presentation) [arXiv] [slide]
Atsushi Nitanda and Taiji Suzuki. Functional Gradient Boosting based on Residual Network Perception. The 35th International Conference on Machine Learning (ICML2018), Proceedings of Machine Learning Research, 80:3819--3828, 2018. [arXiv] [code] [slide]
Atsushi Nitanda and Taiji Suzuki. Gradient Layer: Enhancing the Convergence of Adversarial Training for Generative Models. The 21st International Conference on Artificial Intelligence and Statistics (AISTATS2018), Proceedings of Machine Learning Research, 84:1008--1016, 2018. [arXiv]
Atsushi Nitanda and Taiji Suzuki. Stochastic Difference of Convex Algorithm and its Application to Training Deep Boltzmann Machines. The 20th International Conference on Artificial Intelligence and Statistics (AISTATS2017), Proceedings of Machine Learning Research, 54:470--478, 2017.
Atsushi Nitanda. Accelerated Stochastic Gradient Descent for Minimizing Finite Sums. The 19th International Conference on Artificial Intelligence and Statistics (AISTATS2016), Proceedings of Machine Learning Research, 51:195--203, 2016. [arXiv]
Atsushi Nitanda. Stochastic Proximal Gradient Descent with Acceleration Techniques. The 28th Annual Conference on Neural Information Processing Systems (NIPS2014), In Advances in Neural Information Processing Systems, 27:1574--1582, 2014.
Journal Articles
Naoki Nishikawa, Taiji Suzuki, Atsushi Nitanda, and Denny Wu. Two-layer Neural Network on Infinite-dimensional Data: Global Optimization Guarantee in the Mean-field Regime. Journal of Statistical Mechanics: Theory and Experiment (JSTAT), 2023(11):114007, 2023. (Journal version of NeurIPS2022 paper)
Atsushi Nitanda, Denny Wu, and Taiji Suzuki. Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis. Journal of Statistical Mechanics: Theory and Experiment (JSTAT), 2022(11):114010, 2022. (Journal version of NeurIPS2021 paper)
Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki. Sharp Characterization of Optimal Minibatch Size for Stochastic Finite Sum Convex Optimization. Knowledge and Information Systems (KAIS), 63(9):2513--2539, 2021. (Journal version of ICDM2019 paper)
Atsushi Nitanda. The Growth of the Nevanlinna Proximity Function. Journal of Mathematical Sciences, 16(4):525--543, 2009.
Technical Reports
Atsushi Nitanda. Improved Particle Approximation Error for Mean Field Neural Networks. 2024. [arXiv]
Yuto Mori, Atsushi Nitanda, and Akiko Takeda. BODAME: Bilevel Optimization for Defense Against Model Extraction. 2021. [arXiv]
Linchuan Xu, Jun Huang, Atsushi Nitanda, Ryo Asaoka, and Kenji Yamanishi. A Novel Global Spatial Attention Mechanism in Convolutional Neural Network for Medical Image Classification. 2020. [arXiv]
Shintaro Fukushima, Atsushi Nitanda, Kenji Yamanishi. Online Robust and Adaptive Learning from Data Streams. 2020. [arXiv]
Atsushi Nitanda, Geoffrey Chinot, and Taiji Suzuki. Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems. 2019. [arXiv]
Atsushi Nitanda and Taiji Suzuki. Stochastic Particle Gradient Descent for Infinite Ensemble. 2017. [arXiv]
Notes
Atsushi Nitanda. Note: Noise Conditions and Convergence Analysis of SGD under the Polyak-Łojasiewicz Inequality. 2022. [link]
Doctoral Thesis
Efficient Machine Learning from Gradient Method Perspective in Finite and Infinite Dimensional Spaces
Professional Activities
Reviewer: NIPS, NeurIPS, ICML, AISTATS, ICLR, IJCAI, ICPR, JMLR, IEEE TNNLS/TSP/SPL, IEICE, Neural Networks, Signal Processing.
Selected as a top reviewer at NeurIPS 2019 and ICML 2020. Selected as a TMLR expert reviewer in 2023.
Action Editor: TMLR (Jul. 2023--).
Editorial Board Reviewer: JMLR (Aug. 2020--).
Editorial Board: IEICE Trans. (Jun. 2023--).
Committee: IBISML (Jun. 2022--Jun. 2024).
Organizing Committee: IBIS WS Program Committee (2020, 2024), CAI Workflow Co-chair (2024).
Awards
Atsushi Nitanda and Taiji Suzuki. Outstanding Paper Award, The Ninth International Conference on Learning Representations (ICLR), 2021.
Atsushi Nitanda, Tomoya Murata, and Taiji Suzuki. ICDM '19 Best Paper Candidate for KAIS Publication, IEEE International Conference on Data Mining (ICDM), 2019.
Satoshi Hara, Atsushi Nitanda, and Takanori Maehara. Best Presentation Award, The 22nd Information-Based Induction Sciences Workshop (IBIS2019), 2019.
Atsushi Nitanda. Dean's Award, Graduate School of Information Science and Technology, the University of Tokyo, 2019.
Atsushi Nitanda. Dean's Award, Graduate School of Mathematical Sciences, the University of Tokyo, 2009.
Bio
Atsushi Nitanda is a Principal Scientist at A*STAR CFAR and an Adjunct Associate Professor at NTU. Prior to his current position, he was an Associate Professor at the Kyushu Institute of Technology and an Assistant Professor at the University of Tokyo. Before that, he worked as a researcher at NTT DATA Mathematical Systems Inc. (MSI). He obtained his Ph.D. in Information Science and Technology from the University of Tokyo in 2018. His research interests include stochastic optimization, mean-field optimization, statistical learning theory, kernel methods, and deep learning. He received the Outstanding Paper Award at ICLR in 2021, and the Dean's Awards for his doctoral and master's theses from the University of Tokyo in 2019 and 2009, respectively.