You will need 100s of billions to make a viable POC.
For a PoC? That sounds very unlikely. I think you’re off by at least 2–3 orders of magnitude.
You only need to train a range of small models in order to establish a plausible scaling law, IMO.
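To make that concrete, here is a rough sketch of the idea: fit a power law to the losses of a few small models, then extrapolate. The model sizes and losses are synthetic placeholders (generated from a made-up exponent, not real measurements), and `fit_power_law` / `predict_loss` are hypothetical helper names. It also assumes a pure power law with no irreducible-loss term, which real fits usually need.

```python
import numpy as np

def fit_power_law(sizes, losses):
    """Fit loss ~= a * size**(-alpha) by least squares in log-log space."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
    return float(np.exp(intercept)), float(-slope)

def predict_loss(a, alpha, size):
    """Evaluate the fitted power law at a given model size."""
    return a * size ** (-alpha)

# Synthetic placeholder data: five "small models" whose losses are generated
# from an assumed law (a=10, alpha=0.1). These are NOT real measurements.
sizes = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
losses = 10.0 * sizes ** (-0.1)

a, alpha = fit_power_law(sizes, losses)

# Extrapolate two orders of magnitude past the largest model that was "trained".
extrapolated = predict_loss(a, alpha, 1e10)
print(f"a={a:.3f}, alpha={alpha:.3f}, predicted loss at 1e10 params={extrapolated:.3f}")
```

With real runs the points would be noisy, so how far you can trust the extrapolation depends on how many sizes you cover and how well the fit holds on the largest of them.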