We have conducted new experiments and report the overall performance and PD, w.r.t. NDCG@50 and F1@50, of all compared methods with the Matrix Factorization (MF) backbone recommendation model in Task-R on the Movielenz dataset. The results show a trend similar to that observed for NDCG@20 and F1@20 (reported in the left-most plot of Figure 2): FADE substantially reduces PD with only a modest impact on overall performance.
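For concreteness, the metrics referred to above can be sketched as follows. This is a minimal illustration and not the evaluation code used in the experiments: it assumes binary relevance, and it assumes PD is the absolute gap in mean per-user performance between two user groups. The function names and input layout are hypothetical.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG@k for a single user (assumed formulation)."""
    relevant = set(relevant_items)
    # Discounted gain of every hit in the top-k list.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items (up to k) ranked first.
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(min(len(relevant), k)))
    return dcg / idcg if idcg > 0 else 0.0


def f1_at_k(ranked_items, relevant_items, k):
    """F1@k for a single user: harmonic mean of precision@k and recall@k."""
    relevant = set(relevant_items)
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    if hits == 0:
        return 0.0
    precision = hits / k
    recall = hits / len(relevant)
    return 2 * precision * recall / (precision + recall)


def performance_disparity(scores, groups):
    """Assumed PD: |mean score of group A - mean score of group B|.

    scores: dict mapping user -> per-user metric value (e.g. NDCG@50)
    groups: dict mapping user -> group label (exactly two distinct labels)
    """
    by_group = {}
    for user, score in scores.items():
        by_group.setdefault(groups[user], []).append(score)
    if len(by_group) != 2:
        raise ValueError("performance_disparity expects exactly two groups")
    mean_a, mean_b = (sum(v) / len(v) for v in by_group.values())
    return abs(mean_a - mean_b)
```

Under these assumptions, moving from the @20 to the @50 setting only changes the cutoff `k` passed to the per-user metrics; the PD computation itself is unchanged.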
In lines 594-596, we mentioned that our fairness-aware competitors, Adver and Re-rank, are implemented with a fine-tuning strategy for a fair comparison, even though they were not originally designed for dynamic scenarios. To better answer the reviewer’s question, we provide a detailed report of their performance and PD w.r.t. NDCG@20 and F1@20 at each time period, using the same setting as above (MF backbone recommendation model, tested in Task-R on the Movielenz dataset). The results show that FADE outperforms the other methods in reducing PD at nearly every time period. In terms of overall performance, FADE is comparable to or slightly better than Adver and significantly outperforms Re-rank.