Hacker News

Sanzig yesterday at 1:54 PM

Take a look at Ollama Cloud: https://ollama.com/pricing

You get access to a whole bunch of bleeding edge open models including GLM-5.2, Kimi K2.7, DeepSeek 4 Pro, etc. Inference is run on US/SG/EU cloud providers with zero data retention policies. The $20/mo tier is very generous, in my experience.


Replies

jeremyjh yesterday at 5:17 PM

They don't state where the GLM-5.2 model is run or what its data retention policy is. They do state that for other models, like MiniMax.

cmrdporcupine yesterday at 10:05 PM

Well, I tried the $20/mo tier and used GLM specifically. After maybe 3-4 hours of work I'm already through 50% of my monthly allowance, and I blew through my time-limited quota twice. I won't renew for another month.

I think that only underscores my point: the GLM models are actually not very cost effective.

They essentially cost the same as the SOTA models from OpenAI and Anthropic, while not being quite as smart. I could have gotten about the same amount of work done on the $20 Codex plan. I also had to use my $100 Codex plan to finish the work GLM started before it ran out of quota, and to fix it, since GLM left a bit of a mess.

I like that GLM exists. Other Chinese models are far more cost effective. GLM is expensive, even on a fixed plan.