CactusCon

CactusCon13
February 14-15, 2025
Mesa, AZ

DIRE: Renaming Variables in Decompiled Code with Neural Nets

Decompilers transform binaries into high-level source code and are a critical part of the working hacker's arsenal of tools for malware analysis, reverse engineering, and exploit development. Over time, decompilers have become increasingly sophisticated in reconstructing information lost during the compilation process (e.g., code structure and type information). A longstanding open problem is recovering meaningful variable names that reflect the intent of the original code.

This talk presents DIRE (the Decompiled Identifier Renaming Engine), a new probabilistic technique that uses both lexical and structural information to recover variable names. Where current state-of-the-art tools produce placeholder variable names like "a1" or "iVar", DIRE recovers meaningful variable names like "filename". We present our approach for training and evaluating models of decompiled code using a large corpus of 164,632 unique x86-64 binaries mined from C projects on GitHub. We share the results of applying DIRE at large scale and show that it can predict variable names identical to the names in the original source code up to 74.3% of the time.

You'll leave this talk with knowledge of new techniques in binary decompilation and practical tooling for more accurate variable name recovery. You'll also learn about state-of-the-art approaches to decompilation, outstanding challenges, and new ways of addressing them.

Jeremy Lacomis

Jeremy Lacomis is a Ph.D. student in the Institute for Software Research at Carnegie Mellon University. His research interests include search-based software engineering, automated code and binary transformation, and improving tooling for reverse engineers. Jeremy holds a B.A. in Computer Science from the University of Virginia and an A.S. in Computer Science from Piedmont Virginia Community College.