Yannik Schäfer successfully defended his Master's thesis on Improving Demographic Fairness for Biometric Face Recognition Systems
With the rise of deep neural networks, the performance of biometric systems has increased tremendously. Biometric facial recognition systems are now used in everyday life, for example for entry control at borders, crime prevention, and access control for private devices. Although the accuracy of these systems is generally high, they are not without flaws. In many cases, biometric systems exhibit a demographic bias: different demographic groups are not recognized equally well. This is especially true for facial recognition, in part because demographic features such as gender and skin color are clearly visible in images of human faces.
This thesis investigates whether well-chosen model combinations for decision- and/or score-level fusion can improve the fairness of the fused models in the verification scenario. For this purpose, twelve different models of four different face recognition algorithms were evaluated. The baseline parameter for all models is a False Match Rate (FMR) of 0.1%. The models used for fusion were selected according to three criteria: the models with the lowest FMR for an individual demographic group, the three fairest models for a covariate, and the Pareto-efficient models with respect to FMR-Fairness and False Non-Match Rate (FNMR). The fusions were evaluated by FMR-Fairness, FNMR, and individual group-specific FMR. The results show that fusion can, in individual cases, improve the fairness between specific demographic groups and make the fused model fairer than the initial models: in twelve out of 33 fusions, the fairness of the initial models was improved. Different types of fusion affect the performance parameters differently. A general statement that fusion improves the fairness and/or accuracy of biometric systems cannot be made, but some trends are recognizable. The best selection criterion was the lowest FMR for an individual demographic group. The fusions were most successful in improving fairness with respect to gender and skin color, but improved the fairness between subgroups of these in only two cases. Fusions of only two models were always fairer than the initial models, which was not the case for fusions of three models. The OR-fusion was the only fusion that alleviated the bias between subgroups.
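To make the terms above concrete, the following is a minimal Python sketch of decision-level fusion (OR and AND), a simple score-level fusion, and the FMR and FNMR computed overall and per demographic group. It is an illustration under stated assumptions, not the implementation used in the thesis: all function names are hypothetical, the score-level fusion is assumed to be a plain mean of two similarity scores, and `fmr_gap` (largest minus smallest group-specific FMR) is only a stand-in, since the thesis's exact FMR-Fairness measure is not given here.

```python
from typing import Dict, List, Sequence


def decide(scores: Sequence[float], threshold: float) -> List[bool]:
    """Accept a comparison when its similarity score reaches the threshold."""
    return [s >= threshold for s in scores]


def or_fusion(a: Sequence[bool], b: Sequence[bool]) -> List[bool]:
    """Decision-level fusion: accept if at least one model accepts."""
    return [x or y for x, y in zip(a, b)]


def and_fusion(a: Sequence[bool], b: Sequence[bool]) -> List[bool]:
    """Decision-level fusion: accept only if both models accept."""
    return [x and y for x, y in zip(a, b)]


def mean_score_fusion(scores_a: Sequence[float], scores_b: Sequence[float],
                      threshold: float) -> List[bool]:
    """Score-level fusion (assumed here: mean of both scores, then threshold)."""
    return [(x + y) / 2 >= threshold for x, y in zip(scores_a, scores_b)]


def fmr(decisions: Sequence[bool], is_mated: Sequence[bool]) -> float:
    """False Match Rate: share of non-mated comparisons that were accepted.
    Assumes at least one non-mated comparison is present."""
    non_mated = [d for d, m in zip(decisions, is_mated) if not m]
    return sum(non_mated) / len(non_mated)


def fnmr(decisions: Sequence[bool], is_mated: Sequence[bool]) -> float:
    """False Non-Match Rate: share of mated comparisons that were rejected.
    Assumes at least one mated comparison is present."""
    mated = [d for d, m in zip(decisions, is_mated) if m]
    return sum(not d for d in mated) / len(mated)


def group_fmr(decisions: Sequence[bool], is_mated: Sequence[bool],
              groups: Sequence[str]) -> Dict[str, float]:
    """FMR computed separately for each demographic group label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, label in enumerate(groups) if label == g]
        result[g] = fmr([decisions[i] for i in idx],
                        [is_mated[i] for i in idx])
    return result


def fmr_gap(per_group: Dict[str, float]) -> float:
    """Hypothetical fairness proxy: largest minus smallest group FMR."""
    return max(per_group.values()) - min(per_group.values())
```

A possible use is to threshold each model's scores with `decide`, fuse the two decision lists with `or_fusion`, and compare `fmr_gap(group_fmr(...))` of the fused decisions against that of each single model. At fixed thresholds, the OR-fusion accepts every comparison that either model accepts, so its FNMR cannot be higher and its FMR cannot be lower than that of either constituent model; the AND-fusion behaves the opposite way.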