You can also run these models in the cloud with Ollama. You might ask what the difference is, but these are models whose performance will stay consistent over time, whether run locally or in the cloud. For $200 a year I'm getting some pretty fantastic results running GLM 5.1, and even Minimax 2.7, Kimi 2.5, and Gemma 4, on Ollama's cloud instances. And if you don't like Ollama's cloud, you can run the models on your own cloud instance from the very same providers that Ollama is using. Ollama uses NVIDIA cloud providers (NCPs), although I'm not sure which ones specifically, and claims that the "cloud does not retain your data to ensure privacy and security." [https://ollama.com/blog/cloud-models]
Interesting. The pricing page still lists usage limits, though. How restrictive have you found them?