Hacker News

zahlman yesterday at 10:35 PM

Definitely not my wheelhouse, but I would expect it to be considerably worse.

Simply because source code contains names that were intended to communicate meaning in a way that the LLM is specifically trained to understand (i.e., identifier names chosen from human natural language and picked to scan well when interspersed into the programming language grammar, plus comments, etc.). A binary loses that, at least if debugging information has been scrubbed (and the comments definitely are). Ghidra et al. can only do so much to provide the kind of semantic content that an LLM is looking for.


Replies

tverbeure yesterday at 11:18 PM

I've cut-and-pasted some assembly code into the free version of ChatGPT to reverse engineer some old binaries and its ability to find meaning was just scary.