Hacker News

ffsm8 yesterday at 3:19 PM

Usually they're hemorrhaging performance while training.

From that, it's pretty likely they were training Mythos for the last few weeks and then distilling it into Opus 4.7.

Pure speculation, of course, but it would also explain the sudden performance gains for Mythos, and why they're not releasing it to the general public (because it's the undistilled version, which is too expensive to run).


Replies

utopcell yesterday at 5:17 PM

Mythos is speculated to have 10 trillion parameters. Almost certainly they were training it for months.