Hacker News

muddi900 yesterday at 6:06 PM

z.ai will use quantized models in off hours. Buyer beware


Replies

NekkoDroid today at 9:19 AM

This... doesn't make sense. Why would they use a quantized model when load is low and the full model when load is high???

yogthos yesterday at 7:00 PM

I have a subscription and I have not seen any difference in performance during on/off hours. What exactly are you basing this on?

_aavaa_ yesterday at 6:49 PM

Do you have proof for this?

desireco42 yesterday at 8:06 PM

I hear a lot of people complaining. I am on their Max plan, I never hit limits, I use it non-stop, and overall it has been a fantastic experience.

jauntywundrkind today at 6:47 AM

I was one of the people just absolutely in misery when the GLM-5.1 model dropped. It wasn't quantized, I don't think, but it had some very gnarly issues where it would hit a certain context size, then seemingly try to quantize, and fall apart. It was unusable. It went from being an excellent model all the way to 200k, to being good for only 60k before it couldn't write in sentences and definitely couldn't tool call, then to 100k, then to 120k. It was terrible, and I was so sad; it felt like they had made my subscription so much worse. https://news.ycombinator.com/item?id=47677853

But very shortly after this submission/release of 5.1, after a mass outpouring of sadness, they fixed it. Things have been back to absolutely amazing. I joined right before 4.7, and 4.7 was incredible. 5.0 was fantastic. 5.1 has been a dream. GPT still catches a lot of stuff and is smarter, but man, GLM-5.1 is so capable, and it's frankly often a better writer that often better understands and captures purpose and notion, whereas GPT often feels dry and focused on narrow technicals. I really appreciate GLM-5.1.

And I'm really glad Z.ai fixed the absurd damage they had in their systems. I do suspect they were trying to dynamically quantize as the context window grew, or some such trickery. It was not working at all, but somehow it took months to fix.