Improved algorithms for linear stochastic bandits (2011)
Thompson sampling for contextual bandits with linear payoffs (2013)
Linear Thompson sampling revisited (2016)
Fixed size confidence ellipsoids for linear regression parameters (1966)
Using upper confidence bounds for online learning (2000)
Using confidence bounds for exploitation-exploration trade-offs (2002)
LASSO Bandits
Online decision making with high-dimensional covariates (2020)
Stochastic Bandits with ReLU Neural Networks (2024)