In July 2024, Ramlah Sara Rehman successfully defended her MSc thesis, titled “Training Humans for Synthetic Face Image Detection”.
Abstract:
Advancements in generative models for image synthesis have revolutionised the field of computer vision and artificial intelligence. In recent years, generative models have become capable of producing hyperrealistic synthetic images. These images can be used for nefarious purposes, and their prevalence in digital media can have large-scale negative implications for individuals and governments. The increasing frequency of synthetic content in digital media raises the question of whether authentic content can still be distinguished from synthetic content.
This thesis examines how well humans can distinguish authentic face images from synthetic ones. To test this capability, two perceptual experiments are designed and carried out based on principles from experimental psychology and optimal experimental design. Participants are randomly assigned to either the experimental group or the control group. The experimental group receives training halfway through the experiment, while the control group instead receives a coffee break. In each trial, participants are presented with a face image and are tasked with classifying it as either ‘Real’ or ‘Synthetic’. Each experiment consists of 32 trials, in which face images are presented to participants in a random order. The training session is developed based on face perception theories and professional face identification training material. It consists of a systematic analysis that provides participants with a visual face identification strategy, and it includes example images in which visual artefacts, a consequence of the generation process, appear in synthetic images. Results from the perceptual experiments show that accuracy in the experimental group improved by 3.6% after training, compared to 0.2% in the control group. Training thus resulted in slightly higher accuracy scores, but statistical analysis shows that the improvement is not statistically significant. A discussion of the results explores the limitations of human perception and detection capabilities, particularly with regard to synthetic faces. Future work highlights the continued importance of determining human capability to detect synthetic content in different contexts.
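The experimental design described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's analysis code. It assumes that improvement is measured as accuracy in the second half of the trials minus accuracy in the first half, and it uses a permutation test as a stand-in for the statistical analysis, which the abstract does not specify. All function names are hypothetical.

```python
import random
from statistics import mean


def assign_groups(participant_ids, seed=0):
    """Randomly split participants into experimental and control groups."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"experimental": ids[:half], "control": ids[half:]}


def accuracy(responses, truths):
    """Fraction of trials where the response matches the true label."""
    correct = sum(r == t for r, t in zip(responses, truths))
    return correct / len(truths)


def improvement(responses, truths, split=None):
    """Accuracy after the midpoint minus accuracy before it.

    Assumption: the training (or the break) happens after `split` trials,
    by default halfway through, e.g. after 16 of 32 trials.
    """
    if split is None:
        split = len(truths) // 2
    before = accuracy(responses[:split], truths[:split])
    after = accuracy(responses[split:], truths[split:])
    return after - before


def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Assumption: an illustrative choice of test, not the one used in the thesis.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

In such a setup, `improvement` would be computed once per participant, and the per-participant values of the experimental and control groups would be passed to `permutation_p_value`. A large p-value would correspond to the abstract's finding that the observed gain is not statistically significant.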