Hacker News

PeterStuer, yesterday at 4:50 PM

It does not. It just has a fast way to give you the illusion it "runs continuously" with 25GB of warm memory.

Tbh, I'm not sure paged VRAM could solve this problem for a system with an (assumed) huge cache-miss rate, such as a major LLM server.