[1] Abdolmaleki, Abbas, et al. "Maximum a Posteriori Policy Optimisation." International Conference on Learning Representations. 2018.
[2] Jaegle, Andrew, et al. "Perceiver: General perception with iterative attention." International Conference on Machine Learning. PMLR, 2021.
[3] Chebotar, Yevgen, et al. "Q-Transformer: Scalable offline reinforcement learning via autoregressive Q-functions." Conference on Robot Learning. PMLR, 2023.
[4] Reed, Scott, et al. "A generalist agent." arXiv preprint arXiv:2205.06175 (2022).
[5] Lampe, Thomas, et al. "Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots." arXiv preprint arXiv:2312.11374 (2023).
[6] Bousmalis, Konstantinos, et al. "RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation." arXiv preprint arXiv:2306.11706 (2023).
[7] Lee, Alex X., et al. "Beyond pick-and-place: Tackling robotic stacking of diverse shapes." Conference on Robot Learning. PMLR, 2022.
[8] NIST. "IROS 2017 Robotic Grasping and Manipulation Competition." https://www.nist.gov/el/intelligent-systems-division-73500/iros-2017robotic-grasping-and-manipulation-competition
[9] Hoffmann, Jordan, et al. "Training compute-optimal large language models." arXiv preprint arXiv:2203.15556 (2022).