Hacker News

areoform yesterday at 5:31 PM | 1 reply

I've noticed that the same Claude model will make logical errors at some times but not at others. Its performance seems to vary a lot over time. There's even a graph! https://marginlab.ai/trackers/claude-code/

I haven't seen anyone mention this publicly, but I've noticed that the same model will give wildly different results depending on the quantization. 4-bit is not the same as 8-bit, and so on, in either compute requirements or output quality. https://newsletter.maartengrootendorst.com/p/a-visual-guide-...
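To make the 4-bit vs. 8-bit point concrete, here's a minimal sketch of plain symmetric uniform quantization on random stand-in weights. It is only an illustration: the helper names (`quantize_dequantize`, `mean_abs_error`) and the Gaussian "weights" are made up for this example, and real schemes (per-group scales, GPTQ, and so on) are more elaborate than this.

```python
import random


def quantize_dequantize(weights, bits):
    """Round weights to a symmetric `bits`-bit integer grid, then map back to floats."""
    levels = 2 ** (bits - 1) - 1                 # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(w / scale) * scale for w in weights]


def mean_abs_error(weights, bits):
    """Average absolute difference between the original and restored weights."""
    restored = quantize_dequantize(weights, bits)
    return sum(abs(a - b) for a, b in zip(weights, restored)) / len(weights)


# Hypothetical stand-in for one layer's weights (not taken from any real model).
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

err4 = mean_abs_error(weights, 4)
err8 = mean_abs_error(weights, 8)
print(f"4-bit mean abs error: {err4:.4f}")
print(f"8-bit mean abs error: {err8:.4f}")
```

The 4-bit grid has only 15 representable values per scale, so its rounding error comes out much larger than the 8-bit one. Whether that turns into visibly worse answers depends on the model and the quantization scheme.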

I'm aware that frontier models don't work in the same way, but I've often wondered whether there's a fidelity dial somewhere that's used to change how much memory and other resources each model takes during peak hours vs. off hours. Does anyone know if that's the case?


Replies

8organicbits yesterday at 5:42 PM

I'm not sure that graph shows a time-based correlation. The 60% line stays inside the 95% confidence interval. Is that not just a measurement of noise?
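A rough sketch of that kind of check, assuming each day's score is a pass rate over a fixed number of independent tasks. The numbers below are hypothetical, and I don't know how the tracker actually computes its interval; this uses a Wilson score interval.

```python
import math


def wilson_interval(passes, trials, z=1.96):
    """Wilson score interval for a pass rate (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = passes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


# Hypothetical day: 27 of 50 tasks passed (54%), compared against a 60% baseline.
baseline = 0.60
low, high = wilson_interval(27, 50)
print(f"95% interval: {low:.3f} to {high:.3f}")
print("baseline inside interval:", low <= baseline <= high)
```

With a sample that small the interval is wide, so a day that looks several points below 60% can still be consistent with no change at all.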