Hacker News

siliconc0w · yesterday at 3:56 PM

All these new datacenters are going to be a huge sunk cost. Why would you pay OpenAI when you can host your own hyper-efficient Chinese model for like 90% less cost at 90% of the performance? And that's compared to today's subsidized pricing, which they can't keep up forever.
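
A quick back-of-envelope sketch of that claim (every number below is a hypothetical placeholder, just to show the shape of the comparison):

    # Rough per-token cost: hosted API vs. self-hosted open-weights model.
    # Every number here is a made-up placeholder, not real pricing.
    api_price_per_mtok = 10.00   # hypothetical $ per 1M tokens from a hosted API
    gpu_cost_per_hour = 2.00     # hypothetical rented-GPU cost per hour
    tokens_per_second = 500      # hypothetical batched self-hosted throughput

    self_hosted_per_mtok = gpu_cost_per_hour / (tokens_per_second * 3600) * 1_000_000
    print(f"self-hosted: ${self_hosted_per_mtok:.2f} per 1M tokens")   # ~$1.11
    print(f"hosted API:  ${api_price_per_mtok:.2f} per 1M tokens")
    print(f"savings:     {1 - self_hosted_per_mtok / api_price_per_mtok:.0%}")  # ~89%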


Replies

hadlock · yesterday at 4:29 PM

Eventually Nvidia or a shrewd competitor will release 64/128 GB consumer cards; locally hosted GPT-3.5+ quality is right around the corner, we're just waiting for consumer hardware to catch up at this point.
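
Local hosting is already mostly a software-glue problem; a minimal sketch with the llama-cpp-python bindings looks like the snippet below ("model.gguf" is a placeholder for whatever quantized checkpoint fits your card), so bigger consumer cards mainly raise the size ceiling:

    # Minimal local-inference sketch using llama-cpp-python
    # (pip install llama-cpp-python). "model.gguf" is a placeholder
    # for any quantized open-weights checkpoint.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./model.gguf",  # placeholder path to a 4-bit GGUF file
        n_gpu_layers=-1,            # offload all layers to the GPU if they fit
        n_ctx=4096,                 # context window; memory use grows with it
    )
    out = llm("Q: What is the capital of France? A:", max_tokens=32, stop=["Q:"])
    print(out["choices"][0]["text"])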

GaggiX · yesterday at 4:27 PM

> to today's subsidized pricing, which they can't keep up forever.

The APIs are not subsidized; they probably run at quite a large margin, actually: https://lmsys.org/blog/2025-05-05-large-scale-ep/

> Why would you pay OpenAI when you can host your own hyper efficient Chinese model

The 48 GB of VRAM or unified memory required to run this model at 4-bit quantization is not free either.
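
That 48 GB figure falls straight out of the arithmetic: 4-bit weights cost half a byte per parameter (rough sketch; the parameter count below is illustrative, not a specific model):

    # Back-of-envelope memory estimate for 4-bit quantized weights.
    # 96B parameters is illustrative, not a specific model.
    params = 96e9
    bits_per_weight = 4
    weight_bytes = params * bits_per_weight / 8
    print(f"weights alone: {weight_bytes / 1e9:.0f} GB")  # ~48 GB, before KV cache overhead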
