logoalt Hacker News

sho_hntoday at 2:27 PM4 repliesview on HN

It does all start to feel like we'd get fairly close to being able to convincingly emulate a lot of human or at least animal behavior on top of the existing generative stack, by using brain-like orchestration patterns ... if only inference was fast enough to do much more of it.

The gauge-reading example here is great, but in reality of course having the system synthesize that Python script, run the CV tasks, come back with the answer etc. is currently quite slow.

Once things go much faster, you can also start to use image generation to have models extrapolate possible futures from photos they take, and then describe them back to themselves and make decisions based on that, loops like this. I think the assumption is that our brains do similar things unconsciously, before we integrate into our conscious conception of mind.

I'm really curious what things we could build if we had 100x or 1000x inference throughput.


Replies

moonutoday at 2:57 PM

Idk if you've seen this already but Taalas does this interesting thing where they embed the model directly onto the chip, this leads to super-fast speeds (https://chatjimmy.ai) but the model they're using is an old small Llama model so the quality is pretty bad. But they say that it can scale, so if that's really true that'd be pretty insane and unlock the inference you're talking about.

show 1 reply
Kostictoday at 3:19 PM

Taalas showed that you could make LLMs faster by turning them into ASICs and get 10k+ token generation. It's a matter of time now.

show 1 reply
tootietoday at 4:13 PM

Is emulating human behavior really a valuable end goal though? Humans exist as the evolutionary endpoint of exhaustion hunting large pray and organic tool-making. We've built loads of industrial and residential automation tools in the last 100 years and none of them are humanoid. I'd imagine a household robot butler would be more like R2D2 with lots and lots of arms.

show 2 replies
LetsGetTechnicltoday at 4:04 PM

What if we put slop images into slop machines and got slop^2 back out