If you really want to see fully open training pipelines for modern LLMs, Olmo and, to a lesser extent, Nemotron are what you should look at.
https://github.com/allenai/OLMo
https://github.com/NVIDIA-NeMo/Nemotron
I don't know either of them well, though I'm more familiar with Olmo. My impression is that Nemotron is newer -- why is it less applicable? Is it not totally open like Olmo?