I am an assistant professor in the Department of Artificial Intelligence at Korea University. I received my Ph.D. from KAIST, advised by Jinwoo Shin, and was a postdoctoral fellow at the Vector Institute for Artificial Intelligence, hosted by Murat A. Erdogdu. My current research interest is the theory of deep learning, e.g., analyzing the expressivity, optimization, and generalization properties of machine learning models. I have also worked on network pruning, probabilistic graphical models, Markov chain Monte Carlo, and combinatorial optimization.
Research interests
The theory of machine learning has been developed under exact arithmetic with real-valued inputs, outputs, and parameters. However, real-world implementations of ML models use machine arithmetic (e.g., fixed-point or floating-point arithmetic), which can represent only a finite subset of the reals and performs inexact operations due to round-off errors; the snippet below gives a minimal illustration. I am interested in mathematically understanding real implementations of ML models under machine arithmetic, and I am currently working on the following research topics (among others).
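A minimal illustration, in Python with standard double-precision floats, of how machine arithmetic departs from exact arithmetic:

```python
# Round-off makes basic identities of real arithmetic fail in floating point.
print(0.1 + 0.2 == 0.3)                        # False: 0.1 + 0.2 rounds to 0.30000000000000004
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False: floating-point addition is not associative
```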
Understanding the expressive power of neural networks
Classical universal approximation theorems state that two-layer neural networks with a non-polynomial activation function can approximate any continuous function on a compact set to arbitrary accuracy. Can networks evaluated under machine arithmetic exactly represent any function from machine-representable numbers to machine-representable numbers?
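One way to see why this differs from classical approximation: over a low-precision format such as float16, the input domain is finite, so a target function is a finite (but huge) table, and the relevant notion becomes exact representation rather than approximation. A minimal Python sketch (using NumPy, purely for illustration):

```python
import numpy as np

# Enumerate every 16-bit pattern and reinterpret it as a float16 value.
bits = np.arange(2**16, dtype=np.uint16)
vals = bits.view(np.float16)

# Count the finite values (excluding NaNs and infinities): a function
# from float16 to float16 is a table with this many entries.
print(np.isfinite(vals).sum())  # 63488
```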
Understanding the convergence of optimization algorithms
It is known that GD/SGD/Adam/... converge to local minima for various classes of objective functions under exact arithmetic. How can we extend the notion of convergence to a discrete parameter space? Do optimization algorithms under machine arithmetic still "converge" to local optima within a reasonable number of time steps?
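A toy illustration of the issue, in Python: gradient descent on a one-dimensional quadratic, with the parameter rounded to a fixed-point grid after every update as a stand-in for low-precision arithmetic (the objective, step size, and grid below are hypothetical choices for illustration only):

```python
import numpy as np

def round_to_grid(w, step=1.0 / 16.0):
    """Round to the nearest multiple of `step` (a crude model of fixed-point storage)."""
    return np.round(w / step) * step

w, lr = 1.0, 0.1
for _ in range(50):
    grad = 2.0 * (w - 0.3)          # gradient of f(w) = (w - 0.3)^2
    w = round_to_grid(w - lr * grad)

print(w)                # stalls at 0.4375, a grid point away from the exact minimizer 0.3
print(2.0 * (w - 0.3))  # the gradient there is about 0.275, far from zero
```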
Understanding the generalization properties of optimization algorithms
Why do neural networks generalize to unseen data?
Network quantization
To reduce the computational and memory costs of neural networks, we want to learn networks with low-precision parameters. In this case, the discrepancy between existing theory (based on exact arithmetic) and real-world implementations (low precision) becomes even more significant, yet principled (and efficient) learning algorithms are still missing.
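For context, a minimal sketch (in Python, with hypothetical toy data and a fixed-point grid chosen only for illustration) of one widely used heuristic, quantization-aware training with the straight-through estimator: a full-precision shadow weight is kept, the forward pass uses its quantized version, and the gradient with respect to the quantized weight is applied directly to the shadow weight. This is exactly the kind of heuristic, rather than principled, procedure referred to above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.37 * x                       # ground-truth weight, not exactly representable on the grid below

def quantize(w, step=1.0 / 8.0):
    """Round to the nearest multiple of `step` (3 fractional bits of fixed point)."""
    return np.round(w / step) * step

w, lr = 0.0, 0.05                  # full-precision "shadow" weight
for _ in range(200):
    q = quantize(w)                          # low-precision weight used in the forward pass
    grad_q = np.mean(2.0 * (q * x - y) * x)  # d/dq of the mean squared loss
    w -= lr * grad_q                         # straight-through: treat dq/dw as 1

# The deployed weight hovers within one grid step of the best representable
# value (0.375), but nothing guarantees it settles there.
print(quantize(w))
```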
If you are interested in joining our group, send me an email with your transcript and CV.
[C23] Minimum Width for Universal Approximation using Squashable Activation Functions
Jonghyun Shin, Namjun Kim, Geonho Hwang, Sejun Park
International Conference on Machine Learning (ICML), 2025
[C22] Floating-Point Neural Networks Can Represent Almost All Floating-Point Functions
Geonho Hwang, Yeachan Park, Wonyeol Lee, Sejun Park
International Conference on Machine Learning (ICML), 2025
[C21] Floating-Point Neural Networks Are Provably Robust Universal Approximators
Geonho Hwang, Wonyeol Lee, Yeachan Park, Sejun Park^, Feras Saad (^=corresponding author)
International Conference on Computer Aided Verification (CAV), 2025
[C20] What does automatic differentiation compute for neural networks?
Sejun Park*, Sanghyuk Chun*, Wonyeol Lee (*=equal contribution)
International Conference on Learning Representations (ICLR), 2024 (spotlight presentation)
[C19] Minimum width for universal approximation using ReLU networks on compact domain (pdf)
Namjun Kim, Chanho Min, Sejun Park
International Conference on Learning Representations (ICLR), 2024
[C18] On the Correctness of Automatic Differentiation for Neural Networks with Machine-Representable Parameters (pdf)
Wonyeol Lee^, Sejun Park^, Alex Aiken (^=corresponding authors)
International Conference on Machine Learning (ICML), 2023
[C17/W] Neural Networks Efficiently Learn Low-Dimensional Representations with SGD (pdf)
Alireza Mousavi-Hosseini, Sejun Park, Manuela Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
International Conference on Learning Representations (ICLR), 2023 (spotlight presentation)
NeurIPS workshop on Optimization for Machine Learning, 2022
[C16/W] Guiding Energy-based Models via Contrastive Latent Variables (pdf)
Hankook Lee, Jongheon Jeong, Sejun Park, and Jinwoo Shin
International Conference on Learning Representations (ICLR), 2023 (spotlight presentation)
NeurIPS workshop on Self-Supervised Learning: Theory and Practice, 2022 (oral presentation)
[C15] Generalization Bounds for Stochastic Gradient Descent via Localized ε-Covers (pdf)
Sejun Park, Umut Simsekli, and Murat A. Erdogdu
Conference on Neural Information Processing Systems (NeurIPS), 2022
[C14/W] SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Adversarial Robustness (pdf)
Jongheon Jeong, Sejun Park, Minkyu Kim, Heung-Chang Lee, Doguk Kim, and Jinwoo Shin
Conference on Neural Information Processing Systems (NeurIPS), 2021
ICML workshop on A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning, 2021
[C13] Provable Memorization via Deep Neural Networks using Sub-linear Parameters (pdf)
Sejun Park, Jaeho Lee, Chulhee Yun, and Jinwoo Shin
Conference on Learning Theory (COLT), 2021
Part of this work was presented at the Conference on the Mathematical Theory of Deep Neural Networks (DEEPMATH), 2020 (contributed talk)
[C12] Layer-adaptive Sparsity for the Magnitude-based Pruning (pdf)
Jaeho Lee, Sejun Park, Sangwoo Mo, Sungsoo Ahn, and Jinwoo Shin
International Conference on Learning Representations (ICLR), 2021
[C11] Minimum Width for Universal Approximation (pdf)
Sejun Park, Chulhee Yun, Jaeho Lee, and Jinwoo Shin
International Conference on Learning Representations (ICLR), 2021 (spotlight presentation)
Part of this work was presented at the Conference on the Mathematical Theory of Deep Neural Networks (DEEPMATH), 2020 (contributed talk)
[C10] Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning (pdf)
Jaehyung Kim, Youngbum Hur, Sejun Park, Eunho Yang, Sung Ju Hwang, and Jinwoo Shin
Conference on Neural Information Processing Systems (NeurIPS), 2020
[C9] Learning Bounds for Risk-sensitive Learning (pdf)
Jaeho Lee, Sejun Park, and Jinwoo Shin
Conference on Neural Information Processing Systems (NeurIPS), 2020
[C8] Lookahead: A Far-sighted Alternative of Magnitude-based Pruning (pdf)
Sejun Park*, Jaeho Lee*, Sangwoo Mo, and Jinwoo Shin (*=equal contribution)
International Conference on Learning Representations (ICLR), 2020
[C7] Spectral Approximate Inference (pdf, poster, slide)
Sejun Park, Eunho Yang, Se-Young Yun, and Jinwoo Shin
International Conference on Machine Learning (ICML), 2019
[C6] Learning in Power Distribution Grids under Correlated Injections (pdf)
Sejun Park, Deepjyoti Deka, and Michael Chertkov
Asilomar Conference on Signals, Systems and Computers (ACSSC), 2018
[C5] Exact Topology and Parameter Estimation in Distribution Grids with Minimal Observability (pdf)
Sejun Park, Deepjyoti Deka, and Michael Chertkov
Power Systems Computation Conference (PSCC), 2018
[C4] Rapid Mixing Swendsen-Wang Sampler for Stochastic Partitioned Attractive Models (pdf, poster)
Sejun Park, Yunhun Jang, Andreas Galanis, Jinwoo Shin, Daniel Stefankovic, and Eric Vigoda
International Conference on Artificial Intelligence and Statistics (AISTATS), 2017
[C3] Practical message-passing framework for large-scale combinatorial optimization (pdf)
Inho Cho, Soya Park, Sejun Park, Dongsu Han, and Jinwoo Shin
IEEE International Conference on Big Data, 2015
[C2] Minimum Weight Perfect Matching via Blossom Belief Propagation (pdf)
Sungsoo Ahn, Sejun Park, Michael Chertkov, and Jinwoo Shin
Conference on Neural Information Processing Systems (NIPS), 2015 (spotlight presentation)
[C1] Max-Product Belief Propagation for Linear Programming: Applications to Combinatorial Optimization (pdf)
Sejun Park and Jinwoo Shin
Conference on Uncertainty in Artificial Intelligence (UAI), 2015
[J4] Expressive Power of ReLU and Step Networks under Floating-Point Operations (pdf)
Yeachan Park*, Geonho Hwang*, Wonyeol Lee, Sejun Park (*=equal contribution)
Neural Networks, 2024
[J3] Learning with End-Users in Distribution Grids: Topology and Parameter Estimation (pdf)
Sejun Park, Deepjyoti Deka, Scott Backhaus, and Michael Chertkov
IEEE Transactions on Control of Network Systems, 2020
[J2] Maximum Weight Matching using Odd-sized Cycles: Max-Product Belief Propagation and Half-Integrality (pdf)
Sungsoo Ahn*, Michael Chertkov*, Andrew E. Gelfand*, Sejun Park*, and Jinwoo Shin* (*=alphabetical order)
IEEE Transactions on Information Theory, 2018
[J1] Convergence and Correctness of Max-Product Belief Propagation for Linear Programming (pdf)
Sejun Park and Jinwoo Shin
SIAM Journal on Discrete Mathematics, 2017
Email: sejun.park000@gmail.com