Hacker News

criemen · yesterday at 9:03 PM

> it's them trying to push the models to burn less compute

I'm curious, how does using more tokens save compute?


Replies

b65e8bee43c2ed0 · yesterday at 9:20 PM

Productivity (tokens per second per hardware unit) increases at the cost of output quality, while the price stays the same.

Both Anthropic and OpenAI quantize their models a few weeks after release. They'd never admit it out loud, but it's more or less common knowledge now. No one has enough compute.
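A rough sketch of why quantization raises throughput (all numbers here are made up for illustration): LLM inference is largely memory-bandwidth bound, so cutting bits per weight cuts the data streamed per generated token, at some cost in output quality.

```python
# Illustrative only: approximate weight footprint of a model at
# different quantization levels. Fewer bytes per weight means less
# memory traffic per token, hence higher tokens/sec per hardware unit.

def model_bytes_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight footprint in gigabytes."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Hypothetical 70B-parameter model at fp16, int8, and int4:
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {model_bytes_gb(70, bits):.1f} GB")
```

Halving the bit width roughly halves the bytes read per forward pass, which is where the "more tokens per second on the same hardware" comes from.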

shortstuffsushi · yesterday at 9:11 PM

I think the idea is that each action uses more tokens, which means users hit their limit sooner and are consequently unable to burn more compute.
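The arithmetic behind this is simple. A minimal sketch, with made-up quota and per-action numbers: under a fixed token budget per billing window, raising the tokens consumed per action reduces how many actions a user can trigger before hitting the limit.

```python
# Hypothetical illustration of the quota argument: a fixed token budget
# divided by a larger per-action token cost yields fewer total actions,
# capping the compute any one user can consume per window.

def actions_before_limit(token_budget: int, tokens_per_action: int) -> int:
    """Number of complete actions that fit within the budget."""
    return token_budget // tokens_per_action

budget = 500_000  # assumed per-window token quota (made-up number)

print(actions_before_limit(budget, 2_000))  # 250 actions
print(actions_before_limit(budget, 5_000))  # 100 actions: same budget, fewer actions
```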

BoorishBears · today at 12:01 AM

I'm 99.9% sure Opus 4.7 is a smaller model than 4.6.

Too many signs: the sudden jump in TPS (the biggest smoking gun for me), the new tokenizer, commentary about Project Mythos from Anthropic employees, etc.

It looks like their new Sonnet was good enough to be labeled Opus and their new Opus was good enough to be labeled Mythos.

They'll probably continue post-training and release a more polished version as Opus 5.

bloppe · yesterday at 9:12 PM

It could be the adaptive reasoning.