I've been thinking on how to build a benchmark for this stuff for a while, and d...

vessenes · 2025-03-26T20:16:39 1743020199

Hmm, specifically when it comes to reverse engineering, you have the best benchmark ever - you can check the original code, no?

brokensegue · 2025-03-27T02:58:09 1743044289

That requires an LLM as judge.

dataangel · 2025-03-27T11:40:19 1743075619

No, it doesn't; you just diff against the real source code. Probably something fuzzier and more continuous than an actual diff, but still.
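
As a rough illustration of a "fuzzy/continuous" diff, here is a minimal sketch that scores a decompiled snippet against the original source on a 0 to 1 scale. It assumes a crude regex tokenizer and Python's standard `difflib.SequenceMatcher`; the helper names `tokenize` and `similarity` are made up for this example, and a real benchmark would probably want a language-aware lexer or an AST comparison instead.

```python
import difflib
import re


def tokenize(source: str) -> list[str]:
    """Split source text into identifiers, numbers, and single punctuation
    characters, so that whitespace and layout differences are ignored.
    (Hypothetical helper: a crude regex lexer, not a real C parser.)"""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def similarity(original: str, decompiled: str) -> float:
    """Return a score in [0, 1]: 1.0 means the token streams are identical."""
    a, b = tokenize(original), tokenize(decompiled)
    # autojunk=False disables difflib's heuristic that treats very frequent
    # tokens as junk on long inputs.
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


if __name__ == "__main__":
    print(similarity("int x=1;", "int x = 1;"))  # formatting only
    print(similarity("int x=1;", "int y=1;"))    # one renamed variable
```

Note that this only measures textual closeness: two snippets can score low and still behave identically, or score high and differ in one critical token.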

rfoo · 2025-03-30T09:58:39 1743328719

Besides functional equivalence, a significant part of the value in neural decompilation is the symbols it recovers (function names, variable names, struct definitions including member names). So, if the LLM predicts "FindFirstFitContainer" for a function originally called "find_pool", is this correct? Wrong? 26.333% correct?
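
To make the partial-credit question concrete, here is a minimal sketch of one possible scoring rule: split each identifier into lowercase words (handling both snake_case and CamelCase) and take the Jaccard overlap of the two word sets. The functions `name_tokens` and `name_overlap` are hypothetical names for this example, and the metric itself is an assumption, not an established standard.

```python
import re


def name_tokens(identifier: str) -> set[str]:
    """Split an identifier into lowercase words.
    Handles snake_case, camelCase, and acronym runs such as 'HTTPServer'."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return {p.lower() for p in parts}


def name_overlap(predicted: str, original: str) -> float:
    """Jaccard similarity of the two word sets, in [0, 1]."""
    a, b = name_tokens(predicted), name_tokens(original)
    if not a and not b:
        return 1.0  # two empty names are trivially identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Only "find" is shared: 1 common word out of 5 distinct words.
    print(name_overlap("FindFirstFitContainer", "find_pool"))  # 0.2
    # Same words, different naming convention.
    print(name_overlap("findPool", "find_pool"))  # 1.0
```

An exact-word metric like this gives no credit for "container" versus "pool", even though a human might consider them close, which is exactly why grading recovered names is harder than diffing code.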

brokensegue · 2025-03-27T12:28:17 1743078497

Proving that two pieces of code are equivalent sounds very hard (undecidable in general).
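
Since a general proof is out of reach, one common fallback is to look for disagreement instead: run both versions on many inputs and report the first input where they differ. The sketch below assumes both pieces of code can be wrapped as Python callables taking a single 32-bit integer; `find_counterexample` is a hypothetical helper name. Finding no counterexample is only evidence of equivalence, never a proof.

```python
import random
from typing import Callable, Optional


def find_counterexample(
    f: Callable[[int], int],
    g: Callable[[int], int],
    trials: int = 1000,
    seed: int = 0,
) -> Optional[int]:
    """Return an input on which f and g disagree, or None if none was found.
    None does NOT prove equivalence; it only means no tested input differed."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    for _ in range(trials):
        x = rng.randint(-2**31, 2**31 - 1)
        if f(x) != g(x):
            return x
    return None


if __name__ == "__main__":
    original = lambda x: x * 2
    decompiled = lambda x: x + x
    print(find_counterexample(original, decompiled))  # None: no mismatch found
```

Uniform random inputs tend to miss edge cases such as 0, -1, or overflow boundaries, so a real harness would likely add those explicitly.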