On March 20, 2025, Maciej Salwowski successfully defended his MSc thesis, titled "Efficient Vision Language Models for Digital and Physical Attack Detection".
Abstract:
Recent research in biometric systems has made significant strides in addressing fraudulent practices. However, as detection methods improve, attack techniques become increasingly sophisticated. Attacks on face recognition systems can be broadly divided into physical and digital approaches. Traditionally, deep learning models have been the primary defense against such attacks. While these models perform exceptionally well in the scenarios for which they have been trained, they often struggle to adapt to different types of attacks or varying environmental conditions. They also require substantial amounts of training data to achieve reliable performance, yet biometric data collection faces significant challenges, including privacy concerns and the logistical difficulties of capturing diverse attack scenarios under controlled conditions. This work investigates the application of Vision Language Models (VLMs) for detecting physical presentation attacks and digital morphing attacks in biometric systems. Focusing on open-source models with fewer than 8 billion parameters, this work establishes the first systematic framework for quantitative evaluation of VLMs in security-critical scenarios through in-context learning techniques. The experiments demonstrate that VLMs achieve competitive performance in the presentation attack detection task, outperforming some traditional CNNs without resource-intensive training, and suggest a promising direction for the differential morphing attack detection challenge. The results validate VLMs as promising tools for improving generalization in attack detection.