Reinforcement Learning without Ground-truth State