I'm sure there are plenty of optimization paths left for them if they're a startup. And IMHO smaller models will keep getting better. It's also a great business model if people have to buy your chips for each new LLM release :)
One more thing: it seems like this is a Q3 quant, so only a ~3GB RAM requirement.
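(Back-of-the-envelope sketch of where the ~3GB figure could come from, assuming a roughly 7B-parameter model, which is my guess and not stated anywhere above:

    params = 7e9              # assumed model size, hypothetical
    bits_per_weight = 3.5     # typical effective rate for a Q3-style quant
    gb = params * bits_per_weight / 8 / 1e9
    print(round(gb, 2))       # ~3.06 GB

A larger model at 3-ish bits per weight would obviously land higher, so take the 7B as illustrative only.)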
10 H100 chips for a 3GB model.
I think it’s a niche of a niche at this point.
I'm not sure what optimizations they can do, since a transistor is a transistor.