espadrine, today at 10:29 AM

How much did this pretraining run cost? I am impressed that efforts like this are now practical.

Let me try a guess for the cost; please fact-check it if you can.

They report using 10^22 FLOPs. An EC2 H100 at $5/h[0] (1671 bfloat16 teraFLOPS peak[1]) sustains about 830 TFLOPS at 50% MFU. The pretraining run thus takes (10^22/830e12)/3600 ≈ 3,350 GPU-hours, which at $5/h comes to about $17K.
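
For anyone who wants to fact-check with other numbers, here is the same back-of-the-envelope estimate as a small Python sketch. The function name and parameter names are made up for illustration, and it assumes the $5/h figure is a per-GPU price and that 1671 TFLOPS is the right peak to apply the MFU to; swap in your own values if you disagree with either.

```python
def pretraining_cost(total_flops, peak_flops_per_s, mfu, usd_per_gpu_hour):
    """Rough training cost estimate (hypothetical helper, not from the source).

    total_flops       -- total compute for the run, in FLOPs
    peak_flops_per_s  -- peak throughput of one GPU, in FLOP/s
    mfu               -- model FLOPs utilization, as a fraction (0.5 = 50%)
    usd_per_gpu_hour  -- assumed rental price for one GPU for one hour
    """
    sustained = peak_flops_per_s * mfu       # FLOP/s actually achieved
    gpu_seconds = total_flops / sustained    # wall-clock seconds on one GPU
    gpu_hours = gpu_seconds / 3600
    return gpu_hours, gpu_hours * usd_per_gpu_hour


# Inputs taken from the comment: 10^22 FLOPs, 1671 TFLOPS peak, 50% MFU, $5/h.
gpu_hours, cost_usd = pretraining_cost(1e22, 1671e12, 0.5, 5.0)
print(f"{gpu_hours:,.0f} GPU-hours, about ${cost_usd:,.0f}")
```

The result lands slightly below the hand calculation because 50% of 1671 is 835.5 rather than the rounded 830, but it still rounds to roughly $17K. The estimate scales linearly with price and inversely with MFU, so halving the MFU doubles the cost.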

[0]: https://aws.amazon.com/ec2/capacityblocks/pricing/

[1]: https://www.nvidia.com/en-us/data-center/h100/