> And it gets churned by every single request they receive.
Not true, it gets calculated once and essentially baked into initial state basically and gets stored in a standard K/V prefix cache. Processing only happens on new input (minus attention which will have to content with tokens from the prompt)