What would be the difficulty level for it to just read the machine code; are these models heavily re...

petee • yesterday at 11:24 AM • 5 replies

What would be the difficulty level for it to just read the machine code; are these models heavily relying on human language for clues?

Replies

wongarsu • yesterday at 11:33 AM

Reasoning on pure machine code or disassembly is still hit-and-miss. For better results, you can run the binary through a disassembler, ask an LLM to turn the output into an equivalent C program, and then ask it to work on that. Some of the subtleties might get lost in translation, though.
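A rough sketch of the first half of that pipeline, assuming GNU `objdump` is on the PATH. The helper names and the prompt wording are made up for illustration, and the actual LLM call is left out because it depends on whichever client you use:

```python
import shutil
import subprocess


def disassemble(binary_path):
    """Return objdump's disassembly text for a binary (assumes GNU objdump)."""
    if shutil.which("objdump") is None:
        raise RuntimeError("objdump not found on PATH")
    result = subprocess.run(
        ["objdump", "-d", "--no-show-raw-insn", binary_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def build_decompile_prompt(disassembly, max_chars=20000):
    """Wrap disassembly in a prompt asking for equivalent C (hypothetical wording)."""
    if len(disassembly) > max_chars:
        # Large binaries will not fit in one prompt; cut and say so explicitly.
        disassembly = disassembly[:max_chars] + "\n[truncated]"
    return (
        "Rewrite the following disassembly as an equivalent C program. "
        "Keep function boundaries and flag anything you are unsure about.\n\n"
        + disassembly
    )
```

Usage would be something like `prompt = build_decompile_prompt(disassemble("./firmware.elf"))`, with the path being a placeholder. Truncating is the crudest option; splitting per function would probably lose less.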


dnautics • yesterday at 3:02 PM

I have had Claude read USBPcap captures to reverse engineer an industrial digital camera link. It was like pulling teeth, but I got it done (I would not have been able to do it alone).

estimator7292 • yesterday at 3:22 PM

I had Claude reverse some firmware. I gave it headless Ghidra and it spat out documentation for the internal serial protocol I was interested in. With the right tools, it seems to do pretty well with this kind of task.
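For anyone wondering what "headless Ghidra" can look like in practice, here is a minimal sketch that builds an `analyzeHeadless` command line. It assumes a stock Ghidra install with the launcher under `support/`; the install path, project name, and post-script name in the usage note are placeholders, not anything from my actual setup:

```python
from pathlib import Path


def ghidra_headless_cmd(ghidra_dir, project_dir, project_name, binary, post_script=None):
    """Build an analyzeHeadless invocation that imports and analyzes one binary."""
    cmd = [
        str(Path(ghidra_dir) / "support" / "analyzeHeadless"),
        str(project_dir),   # directory where the Ghidra project is stored
        project_name,       # project name, created if it does not exist
        "-import",
        str(binary),
    ]
    if post_script:
        # Runs a Ghidra script after auto-analysis, e.g. to dump decompiled code.
        cmd += ["-postScript", post_script]
    return cmd
```

Something like `subprocess.run(ghidra_headless_cmd("/opt/ghidra", "/tmp/proj", "fw", "firmware.bin", "DumpFunctions.py"), check=True)` would then run it, where `DumpFunctions.py` is a hypothetical script you would have to write yourself.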

colechristensen • today at 3:26 AM

Having the binary to load into Ghidra, being able to take a memory dump of a live running program, and being able to use Wireshark to capture traffic over network/Bluetooth/USB are all VERY helpful if you don't have the source code.

You use decompilation tools and hope the debug symbols were left in; the decompiler then turns the binary into somewhat human-readable code, which is often enough. Even when there are no symbols, binaries use known libraries or at some point hit documented interfaces, so things can still be reasoned about.

lynx97 • yesterday at 11:40 AM

It will have to use a disassembler, or write one. I recently asked gpt-5.4, fairly casually, to translate the contents of a MIDI file into a custom sound programming language. It just wrote a MIDI parser in Python in one shot, grabbed the data, and did a basically perfect translation on the first try. Nice.
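To give a sense of scale, a MIDI parser of that kind is small. This is not the code the model wrote, just a minimal sketch of my own that pulls note-on events out of a Standard MIDI File, assuming the header's division field is ticks per quarter note (not SMPTE):

```python
import struct


def read_vlq(data, pos):
    """Decode a MIDI variable-length quantity; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos


def parse_midi_notes(data):
    """Return (division, [(abs_tick, channel, note, velocity), ...]) for note-ons."""
    if data[:4] != b"MThd":
        raise ValueError("not a Standard MIDI File")
    header_len, _fmt, ntracks, division = struct.unpack(">IHHH", data[4:14])
    pos = 8 + header_len
    notes = []
    for _ in range(ntracks):
        if data[pos:pos + 4] != b"MTrk":
            raise ValueError("expected MTrk chunk")
        (track_len,) = struct.unpack(">I", data[pos + 4:pos + 8])
        pos += 8
        end = pos + track_len
        tick, status = 0, 0
        while pos < end:
            delta, pos = read_vlq(data, pos)
            tick += delta
            if data[pos] & 0x80:      # new status byte
                status = data[pos]
                pos += 1
            # otherwise running status: reuse the previous status byte
            if status == 0xFF:        # meta event: type, length, payload
                pos += 1
                length, pos = read_vlq(data, pos)
                pos += length
            elif status in (0xF0, 0xF7):  # sysex: length, payload
                length, pos = read_vlq(data, pos)
                pos += length
            else:                     # channel message
                kind = status & 0xF0
                nbytes = 1 if kind in (0xC0, 0xD0) else 2
                if kind == 0x90 and data[pos + 1] > 0:
                    notes.append((tick, status & 0x0F, data[pos], data[pos + 1]))
                pos += nbytes
        pos = end
    return division, notes
```

Call it as `division, notes = parse_midi_notes(open("song.mid", "rb").read())`, with the filename being a placeholder. Note-offs, tempo changes, and everything else are skipped, so a real translation would need more than this.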

