Detailed Experimental Results: Number of Discriminatory Instances Found by Each Fairness Testing Method on the Quantized Versions of Repaired Models
This table reports the number of discriminatory instances found by each fairness testing method for each quantized version of the repaired models and each attribute combination. Here, GRFT-M_ours denotes the results of running GRFT on the models repaired by Ours. To reduce the impact of randomness, we repeat each experiment five times and report the average. The results show that both GRFT and Ours outperform the other baselines across all datasets and models. Quantization can reduce or amplify bias depending on the dataset, but the fairness of M_ours remains largely unaffected.
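The five-run averaging protocol can be illustrated with a minimal sketch. It is not the authors' evaluation code: the names `average_discriminatory_count`, `run_once`, and `placeholder_run` are hypothetical, the seeding scheme is an assumption, and the counts are made-up placeholders rather than results from the table.

```python
from statistics import mean
from typing import Callable


def average_discriminatory_count(
    run_once: Callable[[int], int],
    n_repeats: int = 5,
    base_seed: int = 0,
) -> float:
    """Run a fairness test n_repeats times and average the counts.

    run_once is a hypothetical callable that takes a random seed and
    returns the number of discriminatory instances found in that run.
    """
    # Assumption: each repeat uses a different seed so the runs differ.
    counts = [run_once(base_seed + i) for i in range(n_repeats)]
    return mean(counts)


def placeholder_run(seed: int) -> int:
    """Stand-in for a real tester; returns made-up counts for illustration."""
    return 100 + seed


if __name__ == "__main__":
    print(average_discriminatory_count(placeholder_run))
```

In an actual evaluation, `run_once` would wrap one fairness testing method applied to one quantized model and one attribute combination, so each table cell would correspond to one call of the averaging function.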