merlindru • yesterday at 5:21 PM • 1 reply

Surely training also gets cheaper, so justifying it becomes easier?

I think it'll be more like we get 1-10T-parameter models and then distill those down into smaller models, though.

It seems like the best small models today are all distilled from bigger models

Moreover, I hypothesize Claude Opus 4.7 and now 4.8 are a distillation of Claude Mythos

pseudohadamard • today at 2:39 PM

That's the impression I got too. It seems closer to what the marketing has told us about Mythos than 4.6/4.7 were.
