Torsten Schlett defended his Master Thesis on: Enhancement of 3D-Data for Face Recognition

Face recognition can benefit from the use of depth data, including for presentation attack detection purposes. This work begins with an overview of inexpensive “consumer” depth cameras. Depth video output from these devices can, however, contain defects such as holes, as well as general depth inaccuracies.

Therefore, the primary part of this work investigates a variety of depth enhancement methods, divided into two categories: general enhancer types stemming from RealSense SDK post-processing filters, which are not specifically designed to enhance facial depth input, and deep learning enhancers, which are artificial neural networks with U-Net-like architectures that were created as part of this work. All enhancer types use depth data exclusively as input, in contrast to methods that enhance depth based on, e.g., visible-light color data.
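To illustrate what a depth-only "general" enhancer can look like, the following is a minimal sketch of a hole-filling step that propagates the nearest valid depth value from the left along each row. It is not the RealSense SDK implementation and not code from the thesis. The function name `fill_holes_from_left` and the convention that a hole is encoded as the value 0 are assumptions made for this example.

```python
import numpy as np


def fill_holes_from_left(depth: np.ndarray, hole_value: float = 0.0) -> np.ndarray:
    """Fill hole pixels with the nearest valid depth value to their left.

    Assumption for this sketch: holes are encoded as `hole_value` (0 here).
    Holes at the start of a row have no valid left neighbor and stay holes.
    """
    filled = depth.astype(np.float32, copy=True)
    height, width = filled.shape
    for y in range(height):
        last_valid = None  # most recent valid depth value seen in this row
        for x in range(width):
            if filled[y, x] == hole_value:
                if last_valid is not None:
                    filled[y, x] = last_valid
            else:
                last_valid = filled[y, x]
    return filled


# Small usage example: only depth values are used as input.
example = np.array([[1.0, 0.0, 0.0, 2.0],
                    [0.0, 3.0, 0.0, 0.0]])
result = fill_holes_from_left(example)
```

Such a filter uses no knowledge about faces, which is the kind of limitation a learned enhancer is meant to overcome.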

Due to the apparent lack of real-world camera datasets with suitable properties, face depth ground truth images and degraded forms thereof are synthesized, both for the deep learning training and for an experimental quantified evaluation of all enhancer types. Enhancer output samples are nevertheless also presented for real camera data, namely custom RealSense D435 depth images and Kinect v1 data from the KinectFaceDB, with special attention given to the description of both devices. It is concluded that the deep learning enhancement approach is superior to the tested general enhancers, without overly falsifying depth data when non-face input is provided. Furthermore, it is extrapolated that, given a finite amount of additional development time, more optimized deep learning network architectures and training procedures can still be achieved, whereas “hand-crafted” enhancement methods are less likely to attain comparable or better results in the same timeframe. The implemented system, which among other things uses PRNet for the ground truth synthesis, is described in detail for potential future work, for which a number of different topics are proposed.
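The idea of degrading synthesized ground truth and then quantifying the error, as mentioned above, can be sketched as follows. This is an illustrative assumption, not the degradation model or the metric used in the thesis: the function names `degrade_depth` and `rmse`, the Gaussian noise, the random hole mask, and all parameter values are hypothetical.

```python
import numpy as np


def degrade_depth(ground_truth, hole_fraction=0.1, noise_std=2.0, seed=0):
    """Return a degraded copy of a depth image plus the hole mask.

    Hypothetical degradation for this sketch: additive Gaussian noise,
    followed by randomly placed hole pixels set to 0.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, noise_std, size=ground_truth.shape)
    degraded = (ground_truth + noise).astype(np.float32)
    hole_mask = rng.random(ground_truth.shape) < hole_fraction
    degraded[hole_mask] = 0.0
    return degraded, hole_mask


def rmse(a, b):
    """Root mean squared error between two depth images of equal shape."""
    diff = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))


# Usage: a flat synthetic "ground truth" plane at 500 depth units.
ground_truth = np.full((64, 64), 500.0, dtype=np.float32)
degraded, hole_mask = degrade_depth(ground_truth)
error_before_enhancement = rmse(degraded, ground_truth)
```

An enhancer can then be scored by comparing `rmse(enhanced, ground_truth)` against `error_before_enhancement`, which is only possible because the ground truth is known for synthesized data.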