Research

Below, you’ll find my writing organized by publication type and year.

Job Market Paper

"Personalized AI Alignment Revisited: the Fundamental Importance of User Diversity", by Enoch H. Kang, 2026

> This paper theoretically characterizes personalized AI alignment and discusses how this redefines user value in modern AI-driven platforms.

Working Papers
(Chronological order)

"Empirical risk minimization for Inverse RL and Dynamic Discrete Choice models" by Enoch H. Kang*, Hema Yoganarasimhan and Lalit Jain, 2025 (minor revision, Operations Research) (Github)

> This paper solves an open problem of scalably estimating a dynamic discrete choice model with provable guarantees using gradient-based methods.

Accepted in the main track, ACM Economics and Computation (EC) 2025
Invited tutorial talk (with John Rust), Econometric Society Summer School in Dynamic Structural Econometrics 2025 (YouTube Link) (Slides)
The 2025 World Congress of the Econometric Society (ESWC 2025)
UBC econometric lunch seminar

"TextBO: Bayesian Optimization in Language Space for Eval-Efficient Self-Improving AI", by Enoch H. Kang* and Hema Yoganarasimhan, 2025 (major revision, Management Science) (Slides, Github, App, Demonstration for MBA class)

> In social science, evaluation is expensive because it involves humans. We propose eval-optimal self-improving AI and apply it to ad optimization.

Stanford AI & Marketing: New Methods and New Risks Conference
Columbia AI/ML conference
Frank M. Bass UTD-FORMS Conference
Symposium on Artificial Intelligence in Marketing
ICLR 2026 workshop (AI with Recursive Self-Improvement)

"LLM Personas as a Substitute for Field Experiments in Method Benchmarking", by Enoch H. Kang, 2025

> This paper proves that swapping humans for personas is indistinguishable from changing the evaluation population (e.g., New York to Jakarta), justifying the use of persona simulation for method benchmarking. It also derives the persona dataset size required for the simulation to serve as a valid benchmark.

ICML 2026 CTB (Combining Theory and Benchmarks) workshop

"Stability and generalization for Bellman residuals" by Enoch H. Kang and Kyoungseok Jang, 2025

> This paper proves the first O(1/n) statistical convergence guarantee for gradient-based methods for Offline RL/IRL/Dynamic discrete choice models.

ICML 2026 Workshop (Decision-Making from Offline Datasets to Online Adaptation: Black-Box Optimization to Reinforcement Learning)

"Fast Globally Convergent Gradient-based Offline Reinforcement Learning" by Byungjun Park, Minhyeok Park, Kyoungseok Jang*, Enoch H. Kang*, 2026

> This paper solves an open problem of scalable & stable gradient-based offline reinforcement learning with theoretical optimality guarantees.

ICML 2026 Workshop (Decision-Making from Offline Datasets to Online Adaptation: Black-Box Optimization to Reinforcement Learning)

"Reasonably Reasoning AI Agents Avoid Game-Theoretic Failures in Zero-Shot, Provably" by Enoch H. Kang, 2026

> This paper proves that we don't need unrealistic unified post-training of AI agents to guarantee or predict their convergence to the Nash equilibrium.

ICLR 2026 workshop (Multi-Agent Learning and Its Opportunities in the Era of Generative AI)

"Demystifying the Unreasonable Effectiveness of Online Alignment Methods" by Enoch H. Kang, 2026

> This paper proves that the effective regret rate of online alignment methods is O(1), not O(log T).

"Test and Self-Improve for Markov Sufficiency: An AI-driven Dynamic Measurement Framework for Long-Term Outcomes", working manuscript (Ongoing collaboration with Adobe)

"Multi-Turn Agentic Alignment Revisited", working manuscript

"Debiasing causal machine learning via causal representations in language models", working manuscript

Published Papers

"Is O(log N) practical? Near-Equivalence Between Delay Robustness and Bounded Regret in Bandits and RL", Enoch H. Kang* and P. R. Kumar, NeurIPS 2024

Establishes the importance of customer diversity for handling experiments with delayed rewards.
NeurIPS 2023 Adaptive Experimentation and Active Learning in the Real World (ReALML) workshop

"Bounded (O(1)) Regret Recommendation Learning via Synthetic Controls Oracle", Enoch H. Kang and P. R. Kumar, Allerton 2023

ICML 2022 Adaptive Experimentation and Active Learning in the Real World (ReALML) workshop
RecSys 2022 Causality, Counterfactuals, Sequential Decision-Making & Reinforcement Learning workshop (Selected as the Long Oral presentation)

"Learning NP-Hard Multi-Agent Assignment Planning using GNN: Inference on a Random Graph and Provable Auction-Fitted Q-learning", Enoch H. Kang, Taehwan Kwon, James R. Morrison, and Jinkyoo Park, NeurIPS 2022

ICML GRL+: novel applications, best paper runner-up (Oral presentation link)

Lecture notes

A Lecture Note on RL and IRL:

Part I: Foundations of Offline Reinforcement Learning

Part II: Foundations of Inverse Reinforcement Learning and Dynamic Discrete Choice Models

Tenets of my research

I believe there is a strong parallel between the mental models used in academic research and those in venture investment. I tend to focus on problems that resemble the 'pre-seed' or 'seed' stage, and I draw significant inspiration from Mike Maples Jr.’s frameworks for seed investing.

Also, my approach to inquiry is deeply rooted in the philosophy of Richard Hamming’s 'You and Your Research' - a commitment to tackling significant problems and striving for excellence.

"... One of the characteristics of successful scientists is having courage. Once you get your courage up and believe that you can do important problems, then you can. If you think you can't, almost surely you are not going to."
