B. Settles. Active Learning, 2012 (Comprehensive introduction to active learning fundamentals).
T. Lattimore and C. Szepesvári. Bandit Algorithms, 2020 (Textbook on multi-armed bandits and theory of exploration-exploitation).
R. Sutton and A. Barto. Reinforcement Learning: An Introduction (2nd Ed.), 2018 (RL textbook covering MDPs, exploration, and reward learning).
D. Li et al., A Survey on Deep Active Learning: Recent Advances and New Frontiers, TNNLS 2024 (Overview of deep AL methods, including query strategies and use of representations).
Scalable Deep Active Learning
Batch Active Learning at Scale -- Citovsky et al., NeurIPS 2021
Stochastic Batch Acquisition for Deep Active Learning -- Kirsch et al., 2021
Learning Loss for Active Learning -- Yoo & Kweon, 2019
A Survey on Active Deep Learning: From Model Driven to Data Driven -- Liu et al., CSUR 2022
Label-Efficient Deep Active Learning
A Neural Pre-Conditioning Active Learning Algorithm to Reduce Label Complexity -- Kong et al., NeurIPS 2022
On Statistical Bias In Active Learning: How and When To Fix It -- Farquhar et al., ICLR 2021
Active Learning on a Budget: Opposite Strategies Suit High and Low Budgets -- Hacohen et al., ICML 2022
Active Testing: Sample-Efficient Model Evaluation -- Kossen et al., ICML 2021
Active Learning with Data Augmentation
LADA: Look-Ahead Data Acquisition via Augmentation for Deep Active Learning -- Kim et al., NeurIPS 2021
When Active Learning Meets Implicit Semantic Data Augmentation -- Chen et al., ECCV 2022
Towards Controlled Data Augmentations for Active Learning -- Yang et al., ICML 2023
Bayesian Generative Active Deep Learning -- Tran et al., ICML 2019
Query Synthesis and Generative Models for AL
Generative Adversarial Active Learning -- Zhu & Bento, 2017
Adversarial Sampling for Active Learning -- Mayer & Timofte, WACV 2020
Dual Generative Adversarial Active Learning -- Guo et al., Applied Intelligence 2021
Generative Active Learning for Image Synthesis Personalization -- Zhang et al., ACM MM 2024
Neural Contextual Bandits and Efficient Exploration
Learning Neural Contextual Bandits through Perturbed Rewards -- Jia et al., ICLR 2022
Deep Bandits Show-Off -- Zhu et al., NeurIPS 2021
NeuralUCB & NeuralTS -- Zhou et al., 2020; Zhang et al., 2021
Provably Efficient Neural Bandits -- Salgia et al., ICML 2023
Representation Learning and Deep Kernels in Bandits
Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees -- Tirinzoni et al., NeurIPS 2022
PFNs4BO: In-Context Learning for Bayesian Optimization -- Müller et al., ICML 2023
Efficient Bayesian Optimization with Deep Kernel Learning and Transformer Pre-trained on Multiple Heterogeneous Datasets -- Lyu et al., ICLR 2023
Neural Diffusion Processes -- Dutordoir et al., ICML 2023
Active Causal Representation Learning & RL
Amortized Active Causal Induction with Deep Reinforcement Learning -- Annadani et al., NeurIPS 2024
Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning -- Sontakke et al., ICML 2021
BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning -- Lin et al., NeurIPS 2024
Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations -- Yang et al., ICLR 2025
Meta-Exploration
MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration -- Zhang et al., ICML 2021
Hindsight Experience Replay -- Andrychowicz et al., NeurIPS 2017
Curiosity-driven Exploration by Self-supervised Prediction -- Pathak et al., 2017
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables -- Rakelly et al., ICML 2019
Curriculum Learning and Automatic Goal Generation
Curriculum for Reinforcement Learning, Lil'Log, https://lilianweng.github.io/posts/2020-01-29-curriculum-rl/
Prioritized Level Replay -- Jiang et al., ICML 2021
Automatic Goal Generation for Reinforcement Learning Agents (Goal GAN) -- Florensa et al., ICML 2018
Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design -- Dennis et al., NeurIPS 2020
Active Preference Learning for LLM Alignment
Deep Bayesian Active Learning for Preference Modeling in Large Language Models -- Melo et al., NeurIPS 2024
Active Preference Learning for LLMs -- Muldrew et al., ICML 2024
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment -- Zhang et al., TMLR 2025
Pairwise Proximal Policy Optimization: Language Model Alignment with Comparative RL -- Wu et al., COLM 2024
Active In-Context Learning
Active Learning Principles for In-Context Learning with LLMs -- Margatina et al., EMNLP 2023
Which Examples to Annotate for ICL? -- He et al., 2024
Designing Informative Metrics for Few-Shot Example Selection -- Yuan et al., 2023
Learning To Retrieve Prompts for In-Context Learning -- Rubin et al., 2022