Foundation model accuracy test
start of training (1/3)
output scores per frame
acc: 73%
middle of training (2/3)
output scores per frame
acc: 70%
end of training (3/3)
output scores per frame
acc: 96%
Reward averaged over the whole episode, tracked during training. All of them increase, indicating that the RL optimization is working well.
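As a rough illustration of the quantity described above, the sketch below averages per-step rewards within each episode and then does a coarse check that the averages rise over training. The function names and the reward values are hypothetical and made up for the example; they are not taken from the actual training logs.

```python
from statistics import mean


def episode_average_reward(rewards):
    """Return the mean of the per-step rewards collected in one episode."""
    if not rewards:
        raise ValueError("episode has no rewards")
    return mean(rewards)


def is_increasing(values):
    """Coarse trend check: True if the last value is higher than the first."""
    return len(values) >= 2 and values[-1] > values[0]


# Hypothetical per-step rewards for three episodes sampled at different
# points in training (illustrative numbers only).
episodes = [
    [0.1, 0.2, 0.0],
    [0.3, 0.4, 0.5],
    [0.8, 0.9, 1.0],
]

averages = [episode_average_reward(ep) for ep in episodes]
print(averages)
print(is_increasing(averages))
```

Comparing only the first and last values is a deliberately simple check; a noisy reward curve would usually be smoothed (for example with a moving average) before judging the trend.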