kgeist • today at 7:23 PM

Were those 16 million sessions used only for alignment, chat format, reasoning, etc.? Or is it possible to train a base model on them too? If a single session is at least 32k tokens, that's already about 0.5 trillion tokens to train on. Interesting.
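A quick back-of-the-envelope check of that estimate, as a minimal sketch. The session count and the 32k figure come from the comment; treating "32k" as exactly 32,000 tokens per session (a lower bound, not a measured average) is an assumption.

```python
# Rough sanity check of the token estimate in the comment above.
sessions = 16_000_000        # "16 mln sessions", as stated in the comment
tokens_per_session = 32_000  # assumed lower bound: "at least 32k tokens"

total_tokens = sessions * tokens_per_session

# Express the total in trillions of tokens.
print(f"{total_tokens / 1e12:.3f} trillion tokens")  # prints "0.512 trillion tokens"
```

Since 32k is a floor on session length, this is a lower bound on the total; the real figure depends on the actual length distribution, which the comment doesn't give.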
