I learned TCP/IP by watching and reading raw packets over packet radio at 1200 baud.
I've noticed the same thing is possible if you watch the output of a slow LLM. Eventually you start to see the machinery. Input tokens = output tokens; it's math. I can't exactly predict the tokens generated, but I can see how they are formed. It's a lot like chess: you can't see every possible move, but the mechanism is understandable.
It's basically possible to build an LLM using just routers + packets, and then hook them up to Wireshark to see it compute!
https://distill.pub/2019/activation-atlas/
I can only imagine what sort of visualizations are going on today inside of the AI labs.
Comment <-> username synergy.