Hacker News

pests, today at 2:10 AM

> And it gets churned by every single request they receive.

Not true. It gets calculated once, is essentially baked into the initial state, and is stored in a standard K/V prefix cache. Processing only happens on new input (minus attention, which still has to contend with the tokens from the prompt).
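
To make that concrete, here is a rough toy sketch in pure Python, not how any real inference server is implemented. Everything in it (`ToyAttention`, `embed`, `project_kv`, `extend`, the 2-d "embeddings") is made up for illustration. It only counts two things: how many tokens need their K/V computed, and how many cached entries attention has to read.

```python
import math


class ToyAttention:
    """Hypothetical single-head attention with counters (illustration only)."""

    def __init__(self):
        self.kv_computations = 0  # tokens whose K/V had to be computed
        self.attention_reads = 0  # cached K/V entries read during attention

    def embed(self, token):
        # Made-up deterministic 2-d "embedding" of a token string.
        h = sum(ord(c) for c in token)
        return [(h % 7) / 7.0, (h % 11) / 11.0]

    def project_kv(self, token):
        # Stand-in for the expensive per-token work a real model does.
        self.kv_computations += 1
        x = self.embed(token)
        key = [x[0] + x[1], x[0] - x[1]]
        value = [2 * x[0], 2 * x[1]]
        return key, value

    def extend(self, cache, tokens):
        """Append K/V for `tokens` to a copy of `cache`; return (cache, last output)."""
        cache = list(cache)  # copy, so a shared prefix cache is never mutated
        out = None
        for tok in tokens:
            k, v = self.project_kv(tok)
            cache.append((k, v))
            q = k  # toy simplification: reuse the key as the query
            scores = [q[0] * ck[0] + q[1] * ck[1] for ck, _ in cache]
            self.attention_reads += len(cache)  # attention still sees every entry
            m = max(scores)
            weights = [math.exp(s - m) for s in scores]
            z = sum(weights)
            out = [
                sum(w * cv[d] for w, (_, cv) in zip(weights, cache)) / z
                for d in range(2)
            ]
        return cache, out


system_prompt = "you are a helpful assistant".split()  # 5 toy tokens
user_input = "hello there".split()                     # 2 toy tokens

# Compute the system prompt's K/V once and keep it as the prefix cache.
model = ToyAttention()
prefix_cache, _ = model.extend([], system_prompt)

# A request that reuses the prefix cache only computes K/V for the new tokens.
model.kv_computations = 0
model.attention_reads = 0
_, out_cached = model.extend(prefix_cache, user_input)
cached_kv = model.kv_computations
cached_reads = model.attention_reads

# The same request without a cache recomputes K/V for the whole prompt.
uncached = ToyAttention()
_, out_full = uncached.extend([], system_prompt + user_input)
full_kv = uncached.kv_computations

print("K/V computed with prefix cache:", cached_kv)
print("K/V computed without cache:", full_kv)
print("cached entries read by attention:", cached_reads)
print("same output either way:", out_cached == out_full)
```

The cached request only does K/V work for the two new tokens, but its attention step still reads the five prompt entries each time, which is the "minus attention" caveat.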