Repairing Fairness of Neural Networks via Neuron Condition Synthesis
Anonymous
Deep Neural Networks (DNNs) have achieved tremendous success in many applications. However, it has been demonstrated that DNNs can exhibit undesirable behaviors with respect to robustness, privacy, and other trustworthiness issues. Among these, fairness (e.g., non-discrimination) is a particularly important property, especially when DNNs are applied in sensitive domains (e.g., finance and employment).
In this work, we propose a method to effectively and efficiently repair the fairness issues of DNNs without requiring additional data (e.g., discriminatory instances). Our basic idea is inspired by traditional program repair methods that synthesise proper condition checks. To repair a traditional program, a typical method is to localize the defects and repair the program logic by adding condition checks. Similarly, for DNNs, we try to understand the unfair logic and reformulate it with well-designed condition checks. In this paper, we synthesise a condition that filters out the features of the protected attributes in the DNN. Specifically, we first perform a neuron-based analysis to check the functionalities of individual neurons. Then, a new condition layer is added after each hidden layer to penalize neurons that are critical to the protected features (i.e., features relevant to protected attributes) and promote neurons that are critical to the non-protected features (i.e., features relevant to the original task).
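To make the idea concrete, the following is a minimal PyTorch-style sketch of what such an inserted condition layer could look like. The class names, the index sets identifying protected- and task-critical neurons, and the fixed penalty/boost factors are illustrative assumptions only; in the proposed approach the condition is synthesised from the neuron-based analysis rather than fixed by hand.

# Illustrative sketch, not the authors' implementation: a condition layer that
# suppresses neurons identified as critical to protected attributes and
# amplifies neurons identified as critical to the original task.
import torch
import torch.nn as nn


class ConditionLayer(nn.Module):
    def __init__(self, width, protected_idx, task_idx, penalty=0.1, boost=1.5):
        super().__init__()
        scale = torch.ones(width)
        scale[protected_idx] = penalty   # penalize protected-feature neurons
        scale[task_idx] = boost          # promote task-relevant neurons
        self.register_buffer("scale", scale)

    def forward(self, x):
        # Element-wise rescaling of the hidden activations.
        return x * self.scale


class RepairedMLP(nn.Module):
    # A hidden layer followed by its condition layer; in the paper's setting,
    # one condition layer would be inserted after each hidden layer.
    def __init__(self, in_dim, hidden, out_dim, protected_idx, task_idx):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            ConditionLayer(hidden, protected_idx, task_idx),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)

In this sketch the "condition" reduces to a per-neuron scaling decided by which index set a neuron falls into; the actual method determines those neuron sets (and the corresponding check) automatically from the functionality analysis of each neuron.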