
nickpsecurity · yesterday at 8:38 PM

On changing the training mix, H2O.ai did that with Danube in 2024:

https://arxiv.org/pdf/2401.16818

With those results, I would have already done that in any model I got to train. There's also the principle that LLMs are often best at what they saw last in training. That also justifies putting more logic, code, and math in at the end for an analytical or coding model. So, there are a few precedents for this technique already.
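
For concreteness, here's a minimal sketch of what that late-stage mix shift could look like: domain names, ratios, and the 80% cutover point are made up for illustration, not taken from the Danube paper.

    import random

    # Hypothetical per-domain mixture weights for a two-stage schedule.
    # Stage 2 is code/math-heavy so analytical data is seen last.
    STAGE_MIXES = {
        "stage1": {"web": 0.70, "code": 0.15, "math": 0.05, "books": 0.10},
        "stage2": {"web": 0.30, "code": 0.35, "math": 0.25, "books": 0.10},
    }

    def sample_domain(step: int, total_steps: int, stage2_start: float = 0.8) -> str:
        """Pick a data domain for this training step.

        The mix switches to the code/math-heavy blend for the final
        (1 - stage2_start) fraction of training.
        """
        mix = STAGE_MIXES["stage2"] if step >= stage2_start * total_steps else STAGE_MIXES["stage1"]
        domains, weights = zip(*mix.items())
        return random.choices(domains, weights=weights, k=1)[0]

    # Early steps draw mostly web text; the last 20% skews to code and math.
    total = 100_000
    print(sample_domain(10_000, total))   # likely "web"
    print(sample_domain(95_000, total))   # likely "code" or "math"

In a real pipeline the switch would usually be a gradual interpolation between the two mixes rather than a hard cutover, but the idea is the same: weight the sampler toward the target domains near the end of training.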