Topic: Towards Exact and Inexact Approximate Matching of Executable Binaries
by Lorenz Liebler
D19/2.03a, April 11, 2019 (Thursday), 12.00 noon
Keywords: binary analysis, approximate matching, malware analysis, approximate disassembling
Approximate matching (also known as fuzzy hashing or similarity hashing) is often considered for malware and binary analysis. Recent research revealed major weaknesses of the predominant fuzzy hashing techniques when they are used to measure the similarity of executables (Pagani et al., 2018). In summary, well-known context-triggered piecewise hashing (CTPH) approaches such as ssdeep are not very reliable for comparing binaries, because even benign changes heavily alter the underlying byte representation of the original binary. In this talk we discuss an approximate matching implementation for binary analysis and binary matching. Our approach unites exact and inexact matching capabilities. A first comparison of our approach against four different fuzzy hashing techniques showed major advantages in nearly all of the considered scenarios. Previous research underlines that the performance of existing schemes is volatile across different scenarios. In contrast, our approach is more robust and shows stable results across all considered scenarios.
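To illustrate why byte-level changes matter for context-triggered piecewise hashing, here is a minimal sketch of the general idea. It is a toy and not ssdeep's actual algorithm, and it is not the approach presented in the talk. The trigger rule, the parameters `window` and `modulus`, the helper names, and the "scattered" modification (a crude stand-in for changes spread across a binary) are all assumptions made for illustration.

```python
import difflib
import hashlib
import random


def split_chunks(data, window=4, modulus=8):
    """Split data at content-defined trigger points (toy rule, an assumption)."""
    chunks, start = [], 0
    for i in range(len(data)):
        # The trigger looks only at the last `window` bytes: the "context".
        context = data[max(0, i - window + 1):i + 1]
        if sum(context) % modulus == modulus - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last trigger
    return chunks


def toy_ctph(data):
    """Digest with one hex character per chunk (first SHA-256 nibble)."""
    return "".join(hashlib.sha256(c).hexdigest()[0] for c in split_chunks(data))


def similarity(a, b):
    """Similarity of two digests in [0, 1], via difflib (illustrative only)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# Hypothetical test input: 1024 pseudo-random bytes standing in for a binary.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(1024))
# One inserted byte: a single local edit.
local_edit = original[:512] + b"\x90" + original[512:]
# Every 8th byte flipped: crude stand-in for changes spread across a binary.
scattered = bytes(b ^ 0xFF if i % 8 == 0 else b for i, b in enumerate(original))

print("local edit:", similarity(toy_ctph(original), toy_ctph(local_edit)))
print("scattered: ", similarity(toy_ctph(original), toy_ctph(scattered)))
```

Because the chunk boundaries depend only on local content, a single inserted byte should leave most chunks, and therefore most digest characters, unchanged. Modifications spread over the whole input touch many chunks at once, so the digest similarity is expected to drop sharply. The exact values depend on the toy parameters chosen here.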