Ifkaluva • yesterday at 9:40 PM

Liquid does amazing work, but I kinda feel like they are overtraining their models. 38T tokens seems like a lot for an 8B model

andai • yesterday at 9:50 PM

What's the downside? Don't they stop when they hit diminishing returns?

alt Hacker News