Hacker News

manmal yesterday at 6:16 PM

That’s what, 14 GB/s? The GPU’s VRAM can do 100x that.


Replies

GeekyBear yesterday at 6:27 PM

A discrete consumer GPU card doesn't have enough fast RAM to run a very large model that hasn't been quantized to hell.

That's why all the projects streaming models into the GPU from an SSD popped up recently.
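
For a rough sense of the tradeoff being argued here, a back-of-the-envelope sketch in Python follows. It assumes a dense model where generating each token reads every weight once, so throughput is capped at bandwidth divided by model size. The 14 GB/s and 100x figures come from the thread; the 140 GB model size (70B parameters at 2 bytes each) and the function name are hypothetical, chosen only for illustration.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every token must read all weights once."""
    return bandwidth_gb_s / model_size_gb


SSD_GB_S = 14.0              # SSD streaming figure quoted in the thread
VRAM_GB_S = SSD_GB_S * 100   # "100x that", as claimed in the thread
MODEL_GB = 140.0             # hypothetical: 70B parameters at 2 bytes each

ssd_rate = max_tokens_per_second(SSD_GB_S, MODEL_GB)
vram_rate = max_tokens_per_second(VRAM_GB_S, MODEL_GB)

print(f"SSD streaming: {ssd_rate:.2f} tokens/s")
print(f"VRAM resident: {vram_rate:.2f} tokens/s")
```

Under these assumptions both points hold at once: VRAM is far faster, but a model that size doesn't fit in a consumer card's VRAM to begin with, so the SSD bound is the one that applies.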
