logoalt Hacker News

dofmtoday at 4:42 PM0 repliesview on HN

I have just been humbled by the Gemma 4 26B QAT build (unsloth's version), which insisted repeatedly that I am wrong in my requirements for some niche wordpress code, which cannot be satisfied.

I am a good WP developer so I kept prodding it and it kept insisting, and it explained with clarity. Turns out it is right and I was wrong, as I would have found out if I'd written the code myself.

I've been using this particular test for days, experimenting in ways to generate and prompt code. The 4-bit quantisation of the pre-QAT model does not catch this error. And nor can the Qwen 3.6 sparse model, which confidently blazed past it and never mentioned it.

(FWIW neither did plain ChatGPT; maybe Codex would)

Anecdotal, but there you go. I am somewhat weirded out by it.