> There is no way these people have the resources to train a fully fledged LLM, so claiming that is their goal makes me think they don't intend for the LLM to be useful.
Depends on what they are doing and why. But at most big labs, only the final model training happens on the big clusters. A lot of experimentation happens on <500 GPUs per dev.
So for fast iteration, this seems fine.
This is the use case for the small NVIDIA boxes: a researcher can have one on their desk for $5k and run useful experiments on it before spending all the grant money on a huge training run for the final product.