Hacker News

irthomasthomas yesterday at 5:38 PM

Given that 4.7 was a brand new model, trained from scratch with a unique architecture and tokenization scheme, I don't see the same pattern. It seems arbitrary.


Replies

dominotw yesterday at 5:41 PM

I don't understand the nuances here. What does this mean? Is 4.8 trained on the same model as the previous one, then? What does "brand new" mean?
