Hacker News

jboss10 yesterday at 8:20 PM

I have 8GB of VRAM but 32GB of system RAM. I can run qwen 3.6 35B at 30 tok/s. I also use pi, and it's smart enough to extend itself (multishot, and maybe a few tries).

For you, you could try gemma-4-26B-A4B


Replies

Otternonsenz today at 3:15 PM

Thank you for the recommendation, and so far, it has been working great (within reason, haha). It doesn’t kill my rig when thinking, but it definitely needs more training wheels to nudge it towards the goal.

It seemed to get the idea of my prompt to extend the footer info (I want it to show the model abilities like tool calling or reasoning where the context percent thing is), made a plan and wrote the file, but then got hung up on implementation because it couldn’t figure out how Pi renders that part of the UI in PowerShell.

So trying a different terminal might help on that front, haha.