Flere-Imsaho • yesterday at 7:37 PM • 3 replies

Yeah the future is probably a number of highly specialised small models you can run on your own hardware rather than massive frontier models in the cloud.

That's what I'm betting on anyway.

Replies

thewebguyd • yesterday at 7:43 PM

That also seems to be what Microsoft is betting on, based on what was shown at the Build keynote today, plus the new Surface Ultra and the Surface mini PC with the new Nvidia chip. Nadella really played up local AI as the main use case they have in mind.

search_facility • yesterday at 7:49 PM

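MoE models basically work that way already. Qwen etc. with a low active-parameter count (the A-number in the name) let you run inference on big models locally: only the active params are used for each token, so the rest of the weights still have to be stored but can sit in slower memory.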
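
A rough back-of-the-envelope sketch of the MoE point above, in Python. The 30B-total / 3B-active figures and the 4-bit quantization are illustrative assumptions, not the specs of any particular model, and the estimate ignores KV cache, activations, and runtime overhead.

```python
def moe_footprint(total_params_b, active_params_b, bits_per_weight):
    """Return (stored_gb, touched_per_token_gb) for an MoE model.

    Parameter counts are in billions; results are in gigabytes (1e9 bytes).
    """
    bytes_per_weight = bits_per_weight / 8
    stored_gb = total_params_b * bytes_per_weight    # all experts must live somewhere
    touched_gb = active_params_b * bytes_per_weight  # weights actually read per token
    return stored_gb, touched_gb


# Hypothetical 30B-total / 3B-active model at 4-bit quantization (assumed numbers).
stored, touched = moe_footprint(30, 3, 4)
print(f"weights to store: {stored:.1f} GB, read per token: {touched:.1f} GB")
```

Under these assumptions, per-token compute and memory traffic scale with the active count, while total storage still scales with the full parameter count.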

girvo • yesterday at 10:11 PM

Step 3.7 Flash on my Asus GB10-based mini PC is incredibly close to that today. I’m very impressed, and that’s without MTP (multi-token prediction) to boost performance.
