
andy_ppp · yesterday at 4:45 AM

Fine-tuning these models (at least with PPO or an equivalent RLHF method) requires even more VRAM than inference does, potentially 2-3 times more.
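A back-of-envelope sketch of why training needs so much more memory than inference. The model size, byte counts, and the simplification of PPO down to "policy training plus a frozen reference model" are all illustrative assumptions, not figures from the comment; real usage also depends on activations, KV cache, batch size, and sharding.

```python
# Hypothetical 7B-parameter model; bytes/param under common conventions.
def gib(n_bytes):
    return n_bytes / 2**30

n_params = 7e9

# fp16 inference: just the weights (activations / KV cache ignored).
inference = n_params * 2

# Mixed-precision Adam training of the policy alone:
#   fp16 weights (2) + fp16 grads (2) + fp32 master weights (4)
#   + fp32 Adam moments m and v (4 + 4) = 16 bytes/param.
training = n_params * 16

# PPO additionally keeps a frozen fp16 reference model for the KL term
# (value and reward models omitted for simplicity).
ppo = training + n_params * 2

print(f"inference:          {gib(inference):6.0f} GiB")
print(f"policy training:    {gib(training):6.0f} GiB")
print(f"PPO (+ ref model):  {gib(ppo):6.0f} GiB")
```

Under these assumptions the gap is closer to an order of magnitude than 2-3x; gradient checkpointing, 8-bit optimizers, and offloading are the usual ways to claw some of it back.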


Replies

rusk · yesterday at 10:52 AM

You could use PEFT? Operating on only a subset of weights is fairly standard practice nowadays …
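A quick sketch of why training only a subset of weights helps so much, using LoRA (one popular PEFT method) as the example: instead of updating a dense weight W, it learns a low-rank update B @ A. The layer dimensions, rank, and layer count below are illustrative assumptions, not from any specific model.

```python
# Trainable-parameter count: full fine-tuning vs. LoRA adapters on a
# single dense weight per layer (attention projections etc. omitted).
d_model, rank, n_layers = 4096, 8, 32

full_per_layer = d_model * d_model      # dense weight W (d x d)
lora_per_layer = 2 * d_model * rank     # A (r x d) and B (d x r)

full = n_layers * full_per_layer
lora = n_layers * lora_per_layer

print(f"full fine-tuning: {full:,} trainable params")
print(f"LoRA adapters:    {lora:,} trainable params "
      f"({100 * lora / full:.2f}% of full)")
```

Since gradients and optimizer states are only kept for trainable parameters, shrinking that set to a fraction of a percent is where most of the VRAM savings come from; the frozen base weights still have to fit, but only once and at inference precision.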
