rayboy1995 • yesterday at 5:32 PM • 1 reply

Thanks!! I had disabled that previously while debugging. I can confirm this is helping accuracy, from what I can tell so far. (And speed, since the cache is preserved more often!)

Replies

satvikpendem • yesterday at 7:38 PM

Use the MTP models, which give 2x token generation speed. For example: https://unsloth.ai/docs/models/qwen3.6#mtp-guide
