Hacker News

naravar, yesterday at 1:39 PM (3 replies)

The impetus to continue training at the current pace is driven by competition. So if the money starts drying up, they'll naturally slow down, because they'll have to figure out how to do more with less.

I suspect that once the models hit a point of "good enough" for certain use cases, companies will start putting R&D focus into other, less expensive areas: figuring out how to run more efficiently, UI/UX conventions that help users accomplish what they're trying to do in fewer steps, various kinds of caching of requests, etc. So the cost to serve tokens should only come down over time, and will probably start coming down more rapidly as the returns to model training slow down.
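To make the caching idea concrete, here's a minimal sketch of one of the simplest forms: an exact-match response cache keyed by a hash of the normalized prompt. All names here (`ResponseCache`, `get_or_generate`) are hypothetical, and real systems often go further with semantic (embedding-based) matching; this just illustrates how repeated requests can skip generation entirely.

```python
import hashlib

class ResponseCache:
    """Hypothetical exact-match cache for model responses.

    Keys are hashes of the normalized prompt, so trivially different
    phrasings (extra whitespace, capitalization) still hit the cache.
    """

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prompt: str) -> str:
        # Normalize whitespace and case before hashing.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get_or_generate(self, prompt: str, generate):
        """Return a cached response, or call generate(prompt) and cache it."""
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = generate(prompt)
        self._store[key] = result
        return result
```

Every cache hit is a request that never touches a GPU, which is why even a modest hit rate translates directly into lower serving cost.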

That’ll probably be a while though, because each successive model tends to be a lot better than the last.


Replies

WarmWash, yesterday at 2:34 PM

What's interesting to note is that the amount of "intelligence" labs can squeeze out of an H100, an almost four-year-old GPU, is dramatically higher than what they got out of it in 2022.

It hints that once these labs get a good enough "everyday model", they can work on efficiency so they can serve these models on old hardware. Which is almost certainly already happening.

pier25, yesterday at 2:46 PM

> So if the money starts drying up, then they’ll naturally slow down because they’ll have to figure out how to do more with less.

Meanwhile, companies like Google will keep investing in training...

Anthropic's CEO has suggested all AI companies should slow down training, but obviously this only benefits companies that can't afford to keep training.

hbn, yesterday at 4:37 PM

> UI/UX conventions that help users get what they’re trying to accomplish in fewer steps

If we can expect the past 15 years of software UI/UX history to continue, it's more likely they'll spend the money on making the UI/UX more confusing, removing features, and making basic tasks take more steps than they do today.