Nadav Merlis
Postdoctoral Fellow @ CREST
ENSAE Paris (École nationale de la statistique et de l'administration économique)
Supervised by Prof. Vianney Perchet
About me
I am a postdoctoral fellow at CREST, ENSAE, working with Prof. Vianney Perchet. My research focuses on Multi-Armed Bandit problems and other theoretical aspects of Reinforcement Learning. I completed my Ph.D. in 2022 at the RL^2 lab at the Technion, supervised by Prof. Shie Mannor. Before that, I completed my B.Sc. (summa cum laude) in the Electrical Engineering Department at the Technion.
Publications
Multi-Armed Bandits with Guaranteed Revenue per Arm
Nadav Merlis*, Hugo Richard*, Flore Sentenac*, Corentin Odic, Mathieu Molina, Vianney Perchet
AISTATS, 2024 [paper]
On Preemption and Learning in Stochastic Scheduling
Dorian Baudry, Nadav Merlis, Mathieu Molina, Hugo Richard, Vianney Perchet
ICML, 2023 [paper]
Reinforcement Learning with History-Dependent Dynamic Contexts
Guy Tennenholtz*, Nadav Merlis*, Lior Shani, Martin Mladenov, Craig Boutilier
ICML, 2023 [paper]
Reinforcement Learning with a Terminator
Guy Tennenholtz, Nadav Merlis, Lior Shani, Shie Mannor, Uri Shalit, Gal Chechik, Assaf Hallak, Gal Dalal
NeurIPS, 2022 [paper]
Query-Reward Tradeoffs in Multi-Armed Bandits
Nadav Merlis, Yonathan Efroni, and Shie Mannor
RLDM, 2022 [paper]
Confidence-Budget Matching for Sequential Budgeted Learning
Yonathan Efroni*, Nadav Merlis*, Aadirupa Saha, and Shie Mannor
ICML 2021 [paper]
Ensemble Bootstrapping for Q-Learning
Oren Peer, Chen Tessler, Nadav Merlis, and Ron Meir
ICML 2021 [paper]
Lenient Regret for Multi-Armed Bandits
Nadav Merlis and Shie Mannor
AAAI 2021 [paper]
Reinforcement Learning with Trajectory Feedback
Yonathan Efroni*, Nadav Merlis*, and Shie Mannor
AAAI 2021 [paper]
Tight Lower Bounds for Combinatorial Multi-Armed Bandits
Nadav Merlis and Shie Mannor
COLT 2020 [paper]
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
Yonathan Efroni*, Nadav Merlis*, Mohammad Ghavamzadeh, and Shie Mannor
NeurIPS 2019 [paper]
Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem
Nadav Merlis and Shie Mannor
COLT 2019 [paper]
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
Tom Zahavy*, Matan Haroush*, Nadav Merlis*, Daniel J. Mankowitz, and Shie Mannor
NeurIPS 2018 [paper]
Contact me at nadav \dot merlis \at ensae \dot fr