Breaking the Performance Ceiling in Complex Reinforcement Learning requires Inference Strategies