I tried it today through OpenRouter and the API is atrocious. I got multiple rate-limit errors and other random errors on every turn.
Somebody wrote [1]: "I am never touching Minimax or GLM again. Their APIs had constant outages and I had to restart my runs multiple times — after burning money on the runs that failed midway." I agree 100%.
The model might be good, but if the API is so bad, it's effectively useless.
[1]: https://kasra.blog/blog/i-spent-1500-seeing-if-llms-could-ha...
That’s what happens when you offer something decent at a fraction of the price of Opus: more demand than you can serve.
Give it a few days and additional providers will be up and available on OpenRouter. Then the game of figuring out who’s not nuking the weights and neutering the quantization begins.
I did get a few timeouts yesterday using the official API. I imagine it'll be even worse for the coding plan users.
The entire point of this post is that it's open weights: you can run it yourself and not have to deal with the API issues. You really do have that choice.