PARS ICML25 Rebuttal

(Additional response)

[Fig C] (a) True Q at a fixed state; (b) Policy learned using a highly expressive policy model

[Table E] Performance improvement after applying PA and RS-LN to IQL. Scores are averages of the final evaluations across five random seeds (same as Table 5 in our manuscript).
