Human-Readable Decompilation
Faculty Mentor
Antonio Espinoza
Presentation Type
Poster
Start Date
4-14-2026 9:00 AM
End Date
4-14-2026 11:00 AM
Location
PUB NCR
Primary Discipline of Presentation
Computer Science
Abstract
Decompilation is essential to reverse engineering, malware analysis, security research, software preservation, and software repair. However, existing decompilers prioritize functional correctness over readability, routinely emitting C code with excessive nesting, tangled control flow, and structures that are difficult for human analysts to follow. To address this problem, this project presents a post-processing static analysis tool that targets decompiled C output from Ghidra, improving its readability without altering program logic. Using ANTLR4, the tool constructs a concrete syntax tree from the raw decompiler output. It uses a two-pass refactoring flow that decouples pattern identification from code modification. A series of purpose-built static code transformations is then applied to the abstract syntax tree to eliminate machine-generated artifacts and recover clean, idiomatic C code. The tool was evaluated against 13 binaries from the DARPA Cyber Grand Challenge dataset, successfully applying 555 transformations in total. The results demonstrate that static analysis effectively simplifies complex control flow while strictly preserving program logic, reducing the cognitive load on human analysts.
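The two-pass flow described above can be illustrated with a minimal sketch. It is not the project's implementation: it does not use ANTLR4 or Ghidra, the toy tree classes (Stmt, If) and function names are hypothetical, and the single transformation shown (merging an if whose body is exactly one inner if, neither having an else branch) is an assumed example of a readability rewrite, not one the abstract names. The first pass only identifies matches; the second pass only modifies the tree.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Stmt:
    """A plain statement, kept as raw C text (hypothetical toy node)."""
    text: str


@dataclass
class If:
    """An if-statement with no else branch (hypothetical toy node)."""
    cond: str
    body: List["Node"] = field(default_factory=list)


Node = Union[Stmt, If]


def find_nested_ifs(nodes: List[Node]) -> List[If]:
    """Pass 1 (read-only): collect each `if` whose body is exactly one `if`.

    Matches are returned in pre-order, so an outer match always
    precedes any match nested inside it.
    """
    matches: List[If] = []
    for node in nodes:
        if isinstance(node, If):
            if len(node.body) == 1 and isinstance(node.body[0], If):
                matches.append(node)
            matches.extend(find_nested_ifs(node.body))
    return matches


def apply_merges(matches: List[If]) -> int:
    """Pass 2: rewrite the recorded matches in place, innermost first.

    `if (a) { if (b) { S } }` becomes `if ((a) && (b)) { S }`, which is
    equivalent in C because `&&` short-circuits and neither `if` has an else.
    """
    applied = 0
    for outer in reversed(matches):
        # Re-check the pattern, since earlier rewrites may have changed the tree.
        if len(outer.body) == 1 and isinstance(outer.body[0], If):
            inner = outer.body[0]
            # Parenthesize both sides so operator precedence cannot change.
            outer.cond = f"({outer.cond}) && ({inner.cond})"
            outer.body = inner.body
            applied += 1
    return applied


def render(nodes: List[Node], indent: int = 0) -> List[str]:
    """Pretty-print the toy tree as lines of C-like text."""
    pad = "    " * indent
    lines: List[str] = []
    for node in nodes:
        if isinstance(node, If):
            lines.append(f"{pad}if ({node.cond}) {{")
            lines.extend(render(node.body, indent + 1))
            lines.append(f"{pad}}}")
        else:
            lines.append(pad + node.text)
    return lines


# Toy input standing in for decompiler output.
tree: List[Node] = [
    If("p != NULL", [If("n > 0", [Stmt("use(p, n);")])]),
    Stmt("return 0;"),
]
matches = find_nested_ifs(tree)   # pass 1: identify
applied = apply_merges(matches)   # pass 2: modify
print("\n".join(render(tree)))
```

Keeping the passes separate means the identification step never observes a half-rewritten tree, and the list of matches can be counted or logged before any change is made.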
Recommended Citation
Williams-Breth, Aaron, "Human-Readable Decompilation" (2026). 2026 Symposium. 30.
https://dc.ewu.edu/srcw_2026/ps_2026/p1_2026/30
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.