logoalt Hacker News

peteeyesterday at 11:24 AM5 repliesview on HN

What would be the difficulty level for it to just read the machine code; are these models heavily relying on human language for clues?


Replies

wongarsuyesterday at 11:33 AM

Reasoning on pure machine code or disassembly is still hit and miss. For better results you can run the binary through a disassembler, then ask an llm to turn that into an equivalent c program, then ask it to work on that. But some of the subtleties might get lost in translation

show 1 reply
dnauticsyesterday at 3:02 PM

I have had Claude read usbpcap to reverse engineer an industrial digital camera link. It was like pulling teeth but I got it done (I would not have been able to do it alone)

estimator7292yesterday at 3:22 PM

I had Claude reverse some firmware. I gave it headless ghidra and it spat out documentation for the internal serial protocol I was interested in. With the right tools, it seems to do pretty well with this kind of task.

colechristensentoday at 3:26 AM

Paired with Ghidra having a binary, being able to do a memory dump of a live running program, and being able to use wireshark to dump traffic over network/bluetooth/usb is VERY helpful if you don't have the source code.

You use decompilation tools and hope they left debug symbols in and it turns it into somewhat human-readable language which is often enough. Even when you don't binaries use libraries which are known or at some point hit documented interfaces so things can be reasoned about.

lynx97yesterday at 11:40 AM

It will have to use a disassembler, or write one. I recently casually asked gpt-5.4 to translate the content of a MIDI file to a custom sound programming language. It just wrote a one-shot MIDI parser in Python, grabbed the data, and basically did a perfect translation at first try. Nice.

show 1 reply