(*equal contribution)
International Conference Proceedings (refereed)
Haruka Kiyohara, Daniel Yiming Cao, Yuta Saito, Thorsten Joachims.
An Off-Policy Learning Approach for Steering Sentence Generation towards Personalization.
ACM Conference on Recommender Systems (RecSys), 2025. (acceptance rate=19%)
[paper] [arXiv] [slides] [code]
Haruka Kiyohara, Fan Yao, Sarah Dean.
Policy Design for Two-sided Platforms with Participation Dynamics.
International Conference on Machine Learning (ICML), 2025. (acceptance rate=26.9%)
[arXiv] [slides] [code] [AIhub interview]
Tatsuhiro Shimizu, Koichi Tanaka, Ren Kishimoto, Haruka Kiyohara, Masahiro Nomura, Yuta Saito.
Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits.
ACM Conference on Recommender Systems (RecSys), 2024.
[arXiv]
Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito.
Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation.
International Conference on Learning Representations (ICLR), 2024. (acceptance rate=31%)
[arXiv] [slides] [software] [TokyoTech news]
Haruka Kiyohara, Masahiro Nomura, Yuta Saito.
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction.
ACM Web Conference (WebConf), 2024. (acceptance rate=20.2%)
[arXiv] [code] [slides]
Masatoshi Uehara*, Haruka Kiyohara*, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun.
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs.
Conference on Neural Information Processing Systems (NeurIPS), 2023. (acceptance rate=26.1%) (Spotlight)
[arXiv] [code]
Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito.
Off-Policy Evaluation of Ranking Policies under Diverse User Behavior.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2023. (acceptance rate=22.1%)
[arXiv] [code] [slides]
Takuma Udagawa, Haruka Kiyohara, Yusuke Narita, Yuta Saito, Kei Tateno.
Policy-Adaptive Estimator Selection for Off-Policy Evaluation.
AAAI Conference on Artificial Intelligence (AAAI), 2023. (acceptance rate=19.6%) (Oral Presentation)
[paper] [arXiv] [code] [slides]
Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto.
Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model.
ACM International Conference on Web Search and Data Mining (WSDM), 2022. (acceptance rate=20.2%) (Best Paper Award Runner-Up)
[paper] [arXiv] [code] [slides]
Yuta Saito*, Takuma Udagawa*, Haruka Kiyohara, Kazuki Mogi, Yusuke Narita, Kei Tateno.
Evaluating the Robustness of Off-Policy Evaluation.
ACM Conference on Recommender Systems (RecSys), 2021. (acceptance rate=18.4%)
[paper] [arXiv] [package] [slides]
Akihisa Watanabe, Michiya Kuramata, Kaito Majima, Haruka Kiyohara, Kensho Kondo, Kazuhide Nakata.
Constrained Generalized Additive 2 Model With Consideration of High-Order Interactions.
IEEE International Conference on Electrical, Computer and Energy Technologies (ICECET), 2021.
[paper] [arXiv] [package]
Workshop Papers (refereed)
Haruka Kiyohara, Rayhan Khanna, Thorsten Joachims.
Off-Policy Learning for Diversity-aware Candidate Retrieval in Two-stage Decisions.
RecSys Workshop on Causality, Counterfactuals & Sequential Decision-Making (CONSEQUENCES), 2025. (Oral)
ICML Workshop on Scaling up Intervention Models (SIM), 2025.
Ren Kishimoto, Koichi Tanaka, Haruka Kiyohara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito.
Efficient Offline Learning of Ranking Policies via Top-k Policy Decomposition.
ICML 2024 Workshop on Aligning Reinforcement Learning Experimentalists and Theorists (ARLET), 2024.
Haruka Kiyohara, Yusuke Narita, Yuta Saito, Kei Tateno, Takuma Udagawa.
Safe and Deployment Efficient Policy Learning for Exploring Novel Actions in Recommender Systems.
RecSys 2022 Workshop on Causality, Counterfactuals, Sequential Decision-Making & Reinforcement Learning (CONSEQUENCES+REVEAL), 2022.
[arXiv]
Haruka Kiyohara, Kosuke Kawakami, Yuta Saito.
Accelerating Offline Reinforcement Learning Application in Real-Time Bidding and Recommendation: Potential Use of Simulation.
RecSys 2021 Workshop on Simulation Methods for Recommender Systems (SimuRec), 2021. (Position paper)
[arXiv]
Preprint
Haruka Kiyohara, Ren Kishimoto, Kosuke Kawakami, Ken Kobayashi, Kazuhide Nakata, Yuta Saito.
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation.
arXiv preprint, 2023.
[arXiv] [slides] [software]
Domestic Conference (Japanese, refereed)
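岩田真奈, 桑原淳, 石塚湖太, 倉又迪哉, 清原明加, 中田和秀.
Route Recommendation Using Reinforcement Learning for Cruising Taxi Operations (タクシーの流し営業における強化学習を用いたルート推薦).
オペレーションズ・リサーチ (Operations Research), 2021.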