Learning Reward Functions from Scale Feedback