Assistant Professor of Technology, Operations and Statistics (TOPS) at NYU Stern School of Business

Feb 2024: Our paper on Bayesian best arm identification (with Kyoungseok Jang and Kazutoshi Yamazaki) is available on arXiv.

Jan 2024: Our paper on the cost of replicability in bandits (with Shinji Ito, Yuichi Yoshida, and Souta Koshino) is available on arXiv. If you adopt a batched bandit algorithm, the cost of replicability is o(log T) and comes for free for large T!

Dec 2023: Our postdoc Kyoungseok Jang's new homepage is now available! His expertise includes sparse linear bandits, bilinear bandits, and coin betting.

Nov 2023: Our paper on bandit fair division (with Hakuei Yamada, Kenshi Abe, and Atsushi Iwasaki) is available on arXiv. -> To appear in AISTATS 2024.

Oct 2023: We revised the nonstationary bandit paper [arXiv] with a more readable proof and a regret bound for ADWIN+Thompson sampling. This aims to provide a theoretical basis for our KDD 2019 paper. -> Published in JMLR.

Oct 2023: My slides at WDS (INFORMS WS) are here [slides].

Sep 2023: Our high-dimensional bandit paper (with Masaaki Imaizumi) [slides][poster] was accepted at NeurIPS 2023. It studies the explore-then-commit (EtC) policy in a high-dimensional setting where the eigenvalues of the covariate covariance matrix decay faster than $1/k$.

Jul 2023: Our budgeted KL-UCB paper (published in 2017) is now open access. If you are interested in a bandit problem where rewards, as well as costs, are stochastic, it may be worth reading.

Jan 2023: We have two papers to appear in AISTATS 2023.

Oct 2022: Our recent paper "Strategic Choices of Migrants and Smugglers in the Central Mediterranean Sea" is on arXiv. It analyzes the historical changes in migration across the Mediterranean Sea.

Oct 2022: Our recent paper "Bridging Offline and Online Experimentation: Constraint Active Search for Deployed Performance Optimization" is published in Transactions on Machine Learning Research (TMLR) . We propose the improvement of Bayesian optimization in the case where offline and online evaluations do not perfectly match. 

June 2022: Our recent manuscript "Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification" is here. -> Published in NeurIPS 2022. Slides.

Feb 2022: Our recent manuscript "Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search" is here -> published in IJCAI2022. Unlike fixed-capacity matching, capacity expansion in two-sided matching is hard-to-compute. In this paper, we propose a Monte Carlo tree search-based method for this problem that exploits the structure. We demonstrate the effectiveness of the proposed method with the Japan Residency Matching Program (matching between medical students and hospitals) dataset. 

Feb 2022: Our recent manuscript "Bayes Optimal Algorithm is Suboptimal in Frequentist Best Arm Identification" is here [slides]. While the Bayes optimal algorithm is described in terms of a recursive equation that is virtually impossible to compute exactly, we pave a way to analyze the algorithm. Unlike the static world where frequentist and Bayesian meets in a long run, these two in best arm identification are different even in a long run. (Oct 2023, R&R in Machine Leaning)

Nov 2021: Our recent manuscript "Optimal Simple Regret in Bayesian Best Arm Identification" is here [slides]. The paper is about a Bayesian performance bound for best arm identification, a counterpart of the corresponding bound by Lai (1987) for the K-armed bandit problem. (Jul 2023, to appear in MOR 😀)

Oct 2021: Our recent manuscript "Policy Choice and Best Arm Identification" on best arm identification is here. The paper discusses the subtleties regarding posterior convergence, probability of error, and ordinal optimization. To put it another way, there is a difference between what a Bayesian BAI learner believes and what a frequentist learner believes.

Sep 2021: Our new manuscript "Deviation-based Learning" (with Shunya Noda) is here [slides (new)] [slides (old)]. Inspired by a GPS navigation example, we consider a recommender system that learns not from users' ratings but from the actions they take. In our stylized model, users have expertise on local routes, while the recommender system has an information advantage such as knowledge of route congestion. We show that (1) a naive recommender system fails to learn from the users, because all users follow the recommendation once the recommender system is moderately good, and (2) allowing an extended message space -- the recommender can now signal when two routes have similar payoffs -- improves the learning process.

Jul 2021: Our recent manuscript with Edouard Fouché and Junya Honda is here (slides). The manuscript is about nonstationary bandit algorithms without forced exploration. A general nonstationary bandit algorithm must conduct O(sqrt(T)) forced exploration regardless of its type. Instead, we are interested in the question: to what extent can we do without forced exploration? We characterize a class of global changes and propose an extension of stationary bandit algorithms, such as Thompson sampling, to nonstationary bandit problems. R&R in JMLR.
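
For a concrete feel of the general flavor, here is a minimal toy sketch (not the algorithm or the guarantees in the manuscript): Bernoulli Thompson sampling whose per-arm statistics are reset when a crude ADWIN-style window test detects a change. The window size, threshold, and reward model are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def ts_with_change_detection(reward_fn, K, T, window=50, threshold=0.3):
    """Toy Bernoulli Thompson sampling with a crude ADWIN-style restart.

    reward_fn(arm, t) returns a Bernoulli reward; window and threshold are
    illustrative parameters, not tuned values from the manuscript.
    """
    alpha = np.ones(K)                  # Beta posterior: successes + 1
    beta = np.ones(K)                   # Beta posterior: failures + 1
    history = [[] for _ in range(K)]
    rewards = np.zeros(T)
    for t in range(T):
        arm = int(np.argmax(rng.beta(alpha, beta)))   # posterior sampling
        r = reward_fn(arm, t)
        rewards[t] = r
        alpha[arm] += r
        beta[arm] += 1 - r
        history[arm].append(r)
        # crude change detection: compare the means of two halves of a sliding window
        h = history[arm][-window:]
        if len(h) == window:
            first, second = np.mean(h[: window // 2]), np.mean(h[window // 2:])
            if abs(first - second) > threshold:       # change detected -> restart this arm
                alpha[arm], beta[arm] = 1.0, 1.0
                history[arm] = []
    return rewards

# abrupt change at t = 1000: arm 0 goes from best to worst
def reward_fn(arm, t):
    means = [0.7, 0.5] if t < 1000 else [0.2, 0.5]
    return float(rng.random() < means[arm])

print(ts_with_change_detection(reward_fn, K=2, T=2000).mean())
```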

Feb 2021: Our recent manuscript with Masaya Abe, Kei Nakagawa, and Kenichiro McAlinn is here. This paper is about multiple testing with applications in financial portfolio design. The state-of-the-art method for multiple testing is Storey's method. However, when the hypotheses are correlated, Storey's method tends to yield more false discoveries than its nominal threshold. In this paper, we propose a bootstrap-based alternative for multiple testing. Assuming the availability of samples from the correlated null distribution, we provide more robust control of false discoveries. The method is verified on a factor-based portfolio dataset (the dataset is available; see the paper!).
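
A minimal sketch of the general idea (not the paper's exact procedure): calibrate the rejection threshold against resamples drawn under the joint null, so that the correlation between hypotheses is preserved. The function name, interface, and numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_fdr_threshold(observed_stats, null_samples, q=0.1):
    """Pick the most liberal threshold whose estimated FDR stays below q.

    observed_stats: (m,) test statistics for m (possibly correlated) hypotheses.
    null_samples:   (B, m) statistics resampled under the joint null, preserving
                    the correlation structure (the key assumption in the entry above).
    This function and its interface are illustrative, not the paper's exact method.
    """
    for c in np.sort(observed_stats):              # ascending: most rejections first
        rejections = int(np.sum(observed_stats >= c))
        # expected number of null statistics exceeding c, averaged over resamples
        false_rejections = float(np.mean(np.sum(null_samples >= c, axis=1)))
        if false_rejections / max(rejections, 1) <= q:
            return c
    return np.inf                                  # reject nothing

# toy example: 50 correlated null hypotheses plus 5 true signals
m, B = 55, 2000
cov = 0.5 * np.ones((m, m)) + 0.5 * np.eye(m)      # equicorrelated statistics
null_samples = rng.multivariate_normal(np.zeros(m), cov, size=B)
observed = rng.multivariate_normal(np.zeros(m), cov)
observed[:5] += 3.0                                # the first five hypotheses are signals
c = bootstrap_fdr_threshold(observed, null_samples, q=0.1)
print("threshold:", c, "rejected:", np.where(observed >= c)[0])
```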

Nov 2020: My presentation at INFORMS 2020 is here. A short introduction to the bandit problem plus a new result extracted from our recent paper. I am motivated by the "greedy is optimal in contextual bandits" myth. The truth is mixed: on one hand, greedy is optimal; on the other hand, the constant grows exponentially with the data variance. I discuss a "start with UCB, then go greedy" strategy at the end of the slides.
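
As a rough illustration of the "start with UCB, then go greedy" idea (a toy sketch, not the algorithm or tuned parameters from the slides): a linear contextual bandit that adds a LinUCB-style exploration bonus during an initial warm-up phase and then switches to the plain greedy plug-in rule. The switching round and bonus scale below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def ucb_then_greedy(contexts_fn, theta_true, T, switch_at=200, alpha=1.0, noise=0.1):
    """Toy linear contextual bandit: LinUCB-style exploration for the first
    switch_at rounds, then pure greedy (least-squares plug-in) afterwards."""
    d = theta_true.shape[0]
    A = np.eye(d)                      # ridge-regularized Gram matrix
    b = np.zeros(d)
    regret = 0.0
    for t in range(T):
        X = contexts_fn(t)             # (n_arms, d) context matrix for this round
        theta_hat = np.linalg.solve(A, b)
        means = X @ theta_hat
        if t < switch_at:              # UCB phase: add an exploration bonus
            A_inv = np.linalg.inv(A)
            bonus = alpha * np.sqrt(np.einsum("ij,jk,ik->i", X, A_inv, X))
            scores = means + bonus
        else:                          # greedy phase: exploit the estimate
            scores = means
        a = int(np.argmax(scores))
        r = X[a] @ theta_true + noise * rng.standard_normal()
        A += np.outer(X[a], X[a])
        b += r * X[a]
        regret += np.max(X @ theta_true) - X[a] @ theta_true
    return regret

d = 5
theta = rng.standard_normal(d)
contexts = lambda t: rng.standard_normal((10, d))   # 10 arms, fresh contexts each round
print("cumulative regret:", ucb_then_greedy(contexts, theta, T=2000))
```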

Nov 2020: Our recent manuscript with Shunya Noda is here. This paper is about statistical discrimination that leads to biased sampling of data: if some group is underestimated, this leads to "perpetual underestimation" -- with no new examples, beliefs are never updated, and the bias remains forever. We show this within a multi-armed bandit framework. I will be talking about it at AI4SG and AFCI. To appear in Management Science 😀.
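
A minimal simulation of the mechanism (an illustrative toy, not the model in the paper): two groups of equal true quality, one starting with a pessimistic prior belief. A greedy decision maker never samples the underestimated group, so its belief is never corrected.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy illustration of "perpetual underestimation": group B has the same true
# quality as group A but starts with a pessimistic prior belief. A greedy
# decision maker never selects B, so B's belief is never updated.
true_quality = {"A": 0.6, "B": 0.6}
belief = {"A": 0.6, "B": 0.3}            # group B is initially underestimated
counts = {"A": 10, "B": 10}              # prior pseudo-observations behind each belief

for t in range(10_000):
    g = max(belief, key=belief.get)      # greedy: pick the group believed to be better
    reward = float(rng.random() < true_quality[g])
    counts[g] += 1
    belief[g] += (reward - belief[g]) / counts[g]   # running-mean belief update

print(counts)    # group B is essentially never selected ...
print(belief)    # ... so its belief stays at the biased prior, ~0.3
```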

EMail: junpei [at] komiyama.info / junpei.komiyama [at] nyu.edu

twitter: junpeikomiyama (en), jkomiyama_ (jp), Slideshare, Stern Directory 

Professor Komiyama’s research interests lie in machine learning methodology and its application to business processes. His interests include decision-making models such as the multi-armed bandit problem, the design of experiments, the analysis of algorithmic bias such as fairness in machine learning models, and guaranteeing the reproducibility of scientific findings. He is interested in rethinking how we define findings with data science methods. His work has been presented at machine learning and data mining conferences such as Neural Information Processing Systems (NIPS) and Knowledge Discovery and Data Mining (KDD).

So what is your interest?

My (Junpei's) expertise is mainly in the analysis of sampling and data bias.