Hacker News

kusha, last Wednesday at 7:11 AM

Gemma-4-26B-A4B does not require 50+ GB of VRAM. It is a MoE model, so only 4B parameters are active at a time, which makes it less GPU-dependent. I can run an 8-bit quant of it on 16 GB of VRAM plus ~20 GB of regular DDR5 RAM.
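
Rough back-of-the-envelope math for the numbers above, as a sketch only. It assumes the "26B" and "A4B" in the model name mean 26B total and 4B active parameters, that an 8-bit quant costs about 1 byte per parameter, and it ignores KV cache, activations, and runtime overhead. The `weight_gb` helper is made up for illustration and is not part of any library.

```python
def weight_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight size in GB: parameters * bits / 8 bits per byte."""
    return params_billion * bits_per_param / 8


# Assumed from the model name: 26B total parameters, 4B active per token.
total_8bit = weight_gb(26, 8)    # all experts stored at 8 bits
total_16bit = weight_gb(26, 16)  # same weights left unquantized at 16 bits
active_8bit = weight_gb(4, 8)    # weights actually touched per token

# Memory from the comment: 16 GB VRAM plus ~20 GB system RAM.
budget_gb = 16 + 20
fits = total_8bit <= budget_gb

print(f"8-bit weights:  {total_8bit:.0f} GB")
print(f"16-bit weights: {total_16bit:.0f} GB")
print(f"active per token at 8-bit: {active_8bit:.0f} GB")
print(f"fits in {budget_gb} GB combined: {fits}")
```

Under these assumptions the 8-bit weights come to about 26 GB, which fits in the 36 GB of combined VRAM and RAM, while the 16-bit figure is about 52 GB, possibly where a "50+ GB" estimate would come from.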