RLxF: Reinforcement Learning from World Feedback

@ ICML 2026