
skeledrew, yesterday at 6:05 PM

> it shouldn't cost them more money

As things currently stand, better models mean bigger models, which need more storage, RAM, and CPU, or simply spend more time processing each request. All of this translates to higher costs. Those costs may be mitigated by specific configs that kick in when the provider knows that a given client, one providing particular guarantees, is on the other side.


Replies

joseda-hg, yesterday at 6:26 PM

That’s kind of the point. Even if users can choose which model to use (and apparently the default is the largest one), they could still say (for roughly the same cost): your Opus quota is X, your Haiku quota is Y, go ham. We’ll throttle you when you hit the limit.
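
A rough sketch of what per-model quotas with throttling could look like. This is purely illustrative: `QuotaTracker`, the model keys, and the limits are made up for the example and are not anything a provider is known to expose.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaTracker:
    """Tracks per-model usage against fixed limits (hypothetical numbers)."""

    limits: dict[str, int]
    used: dict[str, int] = field(default_factory=dict)

    def try_consume(self, model: str, tokens: int) -> bool:
        """Record usage and return True if it fits; False means throttle."""
        limit = self.limits.get(model, 0)  # unknown models get no quota
        spent = self.used.get(model, 0)
        if spent + tokens > limit:
            return False
        self.used[model] = spent + tokens
        return True

    def remaining(self, model: str) -> int:
        """Tokens still available for the given model."""
        return self.limits.get(model, 0) - self.used.get(model, 0)


# Assumed limits: the larger model gets the smaller allowance.
tracker = QuotaTracker(limits={"opus": 1_000, "haiku": 10_000})
```

Each model draws from its own bucket, so running out of Opus quota would not block Haiku requests; the caller decides what "throttle" means when `try_consume` returns False.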
