Andreas Nautsch successfully defended his Master's thesis "Speaker Verification using i-Vectors", after working with da/sec and atip GmbH from October 2013 to April 2014.
Speaker verification is becoming more important as a biometric security solution in industrial, forensic, and governmental settings. To protect data privacy, telephone-based authentication concepts are gaining popularity, e.g., for data encryption on mobile devices or user validation in contact centres. Further, forensic speech analysis is relevant, e.g., to lawsuits in which the origin of a recorded cry for help is decisive for distinguishing between self-defence and homicide. Current research emphasises text-independent scenarios, which, e.g., verify speakers on short randomised pass-phrases, and the analysis of duration-variant speech samples whose durations range from one second to many minutes. Speaker characteristics are modelled by statistical patterns, where state-of-the-art research systems prefer template-probe comparisons to model-based comparisons, since model-based approaches were shown to be less accurate and computationally too expensive in duration-variant scenarios. In contrast, template-based systems are known to have disadvantages in short-duration scenarios. State-of-the-art research includes identity vectors (i-vectors), which describe the speaker-characteristic offset from a universal background model.
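To make the idea of a template-probe comparison concrete, the following is a minimal sketch that scores two i-vectors against each other. It assumes cosine similarity as the comparison function and a fixed decision threshold; the abstract does not state which scoring method the thesis uses, and the function names, vectors, and threshold value here are purely illustrative.

```python
import math


def cosine_score(template, probe):
    """Cosine similarity between two i-vectors given as lists of floats.

    Assumption: cosine scoring is used here only as a simple, common way
    to compare a template with a probe; it is not taken from the thesis.
    """
    if len(template) != len(probe):
        raise ValueError("i-vectors must have the same dimension")
    dot = sum(t * p for t, p in zip(template, probe))
    norm_t = math.sqrt(sum(t * t for t in template))
    norm_p = math.sqrt(sum(p * p for p in probe))
    if norm_t == 0.0 or norm_p == 0.0:
        raise ValueError("i-vectors must be non-zero")
    return dot / (norm_t * norm_p)


def verify(template, probe, threshold=0.5):
    """Accept the identity claim if the score reaches the threshold.

    The default threshold is a hypothetical placeholder; in practice it
    would be calibrated on development data.
    """
    return cosine_score(template, probe) >= threshold
```

In such a setup the template i-vector would be extracted once at enrolment and each probe would be compared against it, so a comparison costs only one dot product and two norms.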
The applicability of i-vectors is evaluated in this thesis by comparing an i-vector system to well-established model-based approaches in an industrial short-duration scenario. The i-vector approach is shown not only to operate robustly and fast, but also to augment existing technologies, such that equal error rates below 0.5% can be achieved. Further, a new duration-mismatch compensation technique is presented that increases the robustness and performance of i-vector systems in duration-variant scenarios. This new method was evaluated within a current international evaluation of the National Institute of Standards and Technology (NIST), which examines state-of-the-art i-vector systems: the NIST baseline system was significantly outperformed, with a 19% relative gain in terms of minimum detection cost. Furthermore, this thesis provides a speaker verification framework design based on the standard ISO/IEC 19795-1:2006, Biometric Performance Testing and Reporting, Part 1: Principles and Framework.
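The two figures of merit mentioned above, the equal error rate and the relative gain over a baseline, can be illustrated with a short sketch. It assumes higher scores mean a better match, approximates the equal error rate at the observed threshold where the false match rate and false non-match rate are closest, and uses hypothetical function names; the scores and costs in any usage are toy values, not results from the thesis.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false match rate, false non-match rate) at a threshold.

    A comparison is accepted when its score is >= threshold.
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    return fmr, fnmr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning all observed scores as thresholds.

    Returns the mean of FMR and FNMR at the threshold where their gap is
    smallest (a simple approximation without interpolation).
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        fmr, fnmr = error_rates(genuine, impostor, t)
        gap = abs(fmr - fnmr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fmr + fnmr) / 2
    return best_eer


def relative_gain(baseline_cost, system_cost):
    """Relative improvement of a system's cost over a baseline cost."""
    return (baseline_cost - system_cost) / baseline_cost
```

Under this definition, a relative gain of 19% means the system's minimum detection cost is 0.81 times that of the baseline, whatever the absolute cost values are.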