How hard would it be for it to just read the machine code? Are these models relying heavily on human language for clues?
I have had Claude read USBPcap captures to reverse engineer an industrial digital camera link. It was like pulling teeth, but I got it done (I would not have been able to do it alone).
I had Claude reverse some firmware. I gave it headless Ghidra and it spat out documentation for the internal serial protocol I was interested in. With the right tools, it seems to do pretty well with this kind of task.
Having the binary to load into Ghidra, paired with being able to take a memory dump of a live running program and to use Wireshark to capture traffic over network/Bluetooth/USB, is VERY helpful if you don't have the source code.
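To give a feel for how little is needed to start picking a capture apart, here is a rough sketch that reads the 24-byte global header of a classic pcap file. It assumes the capture was saved as classic pcap rather than pcapng (which Wireshark writes by default), and the function name `parse_pcap_header` is made up for illustration.

```python
import struct

def parse_pcap_header(data: bytes) -> dict:
    """Parse the 24-byte global header of a classic (non-pcapng) pcap file."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes for a pcap global header")
    # The magic number tells us the byte order the file was written in.
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    # version major/minor, timezone offset, sigfigs, snaplen, link type
    major, minor, _tz, _sigfigs, snaplen, link_type = struct.unpack(
        endian + "HHiIII", data[4:24]
    )
    return {"version": (major, minor), "snaplen": snaplen, "link_type": link_type}

# Hand-built header for demonstration: version 2.4, snaplen 65535, link type 1 (Ethernet).
sample = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
print(parse_pcap_header(sample))
```

The link type is the useful part: it tells you (or the model) which framing to expect for every packet record that follows.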
You use decompilation tools and hope they left debug symbols in; the output is somewhat human-readable code, which is often enough. Even when there are no symbols, binaries use known libraries or at some point hit documented interfaces, so things can be reasoned about.
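One cheap way to find those known libraries and interfaces is to pull printable strings out of the binary, roughly what the `strings` utility does. A minimal sketch, with `printable_strings` being a made-up helper name and the sample bytes invented for the demo:

```python
import re

def printable_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    # Space (0x20) through tilde (0x7e) covers printable ASCII.
    pattern = re.compile(rb"[ -~]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]

# Invented sample: an ELF-like prefix followed by two symbol-like names.
blob = b"\x7fELF\x02\x01\x00libusb_open\x00\x01\x02GLIBC_2.17\x00ab\x00"
print(printable_strings(blob))
```

Imported function names and version tags that show up this way are often enough of a hint to look up the documented side of the interface.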
It will have to use a disassembler, or write one. I recently casually asked gpt-5.4 to translate the content of a MIDI file to a custom sound programming language. It just wrote a MIDI parser in Python in one shot, grabbed the data, and did a basically perfect translation on the first try. Nice.
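For scale, the container part of such a parser is small. This is not the code the model produced, just a sketch of what reading the header and track chunks of a Standard MIDI File can look like; `parse_midi_chunks` is a hypothetical name, and decoding the events inside each track (variable-length delta times, running status) is left out.

```python
import struct

def parse_midi_chunks(data: bytes) -> dict:
    """Read the MThd header and collect the raw bytes of each MTrk chunk."""
    if len(data) < 14:
        raise ValueError("too short to be a Standard MIDI File")
    # All multi-byte fields in a MIDI file are big-endian.
    chunk_id, length, fmt, ntracks, division = struct.unpack(">4sIHHH", data[:14])
    if chunk_id != b"MThd" or length < 6:
        raise ValueError("missing or malformed MThd header chunk")
    offset = 8 + length  # skip the header chunk, honoring its declared length
    tracks = []
    while offset + 8 <= len(data):
        cid, clen = struct.unpack(">4sI", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + clen]
        if cid == b"MTrk":
            tracks.append(body)  # unknown chunk types are skipped
        offset += 8 + clen
    return {"format": fmt, "ntracks": ntracks, "division": division, "tracks": tracks}

# Hand-built file: format 0, one track, 480 ticks per quarter note,
# and a track holding only an End of Track meta event.
track = b"\x00\xff\x2f\x00"
sample = (struct.pack(">4sIHHH", b"MThd", 6, 0, 1, 480)
          + struct.pack(">4sI", b"MTrk", len(track)) + track)
print(parse_midi_chunks(sample))
```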
Reasoning on pure machine code or disassembly is still hit and miss. For better results you can run the binary through a disassembler, ask an LLM to turn the output into an equivalent C program, and then ask it to work on that. But some of the subtleties might get lost in translation.