Hacker News

ac29 · last Sunday at 4:00 PM

The very best open models are maybe 3-12 months behind the frontier and are large enough that you need $10k+ of hardware to run them, and a lot more to run them performantly. ROI here is going to be deeply negative vs just using the same models via API or subscription.

You can run smaller models on much more modest hardware but they aren't yet useful for anything more than trivial coding tasks. Performance also really falls off a cliff the deeper you get into the context window, which is extra painful with thinking models in agentic use cases (lots of tokens generated).


Replies

rexreed · yesterday at 12:10 PM

You can also run these models in the cloud with Ollama. You might ask what the difference is, but these are models whose performance stays consistent over time, whether run locally or in the cloud. For $200 a year I'm getting some pretty fantastic results running GLM 5.1, and even Minimax 2.7, Kimi 2.5, and Gemma 4, on Ollama's cloud instances. And if you don't like Ollama's cloud instances, you can run the same models on your own cloud instance from the very same providers Ollama uses. They use NVIDIA cloud providers (NCPs), though I'm not sure which ones specifically, and Ollama claims the "cloud does not retain your data to ensure privacy and security." [https://ollama.com/blog/cloud-models]
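For concreteness, here is a minimal sketch of what calling one of these models looks like against Ollama's standard `/api/chat` REST endpoint. The localhost port and request shape follow Ollama's API docs; the `:cloud` model suffix and the specific tag `glm-5.1:cloud` are illustrative assumptions, not verified tags from the thread.

```python
# Sketch: building and (optionally) sending a chat request to a local Ollama
# daemon, which can proxy "-cloud" tagged models to Ollama's cloud.
# The model tag "glm-5.1:cloud" is hypothetical -- substitute a real tag.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint

def build_chat_request(model: str, prompt: str) -> dict:
    """JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one JSON response instead of a token stream
    }

payload = build_chat_request("glm-5.1:cloud", "Explain this stack trace.")

def send(payload: dict) -> dict:
    """POST the request; needs a running Ollama daemon, so not called here."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The point of the sketch is that the calling code is identical whether the model tag resolves locally or to a cloud-hosted instance, which is what makes the local/cloud switch cheap.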

weavie · last Sunday at 7:25 PM

What are the best open models?
