Just think what this will look like in a couple of years.
I started with a 2400 baud modem; I've seen how this goes.
Sometimes I visualize a setup like this [0], based on 2D art by Simon Stålenhag. Someone has their home robot sitting on a desk connected to their old PC with thick cabling, dumping endless lines of each subsystem's <think> logs to diagnose why it did something weird earlier in the day. Systems pushing 750+ tokens per second per subsystem might even be considered on the slow side for realtime tasks by then.
We probably won't be looking at text like this in a few years.
Probably not. Everyone will still need a lot of reasoning tokens and tool calls. Running the tests for every round is tiring but must be done.
Probably something like this: https://sb0xw.csb.app/
Hopefully like this (but smarter): https://chatjimmy.ai/