REVEAL 2020: Bandit and Reinforcement Learning from User Interactions