Genuine question: isn’t the cost of keeping a persistent warm cache for sessions that sit idle for hours or days significant when it’s done for hundreds of thousands of users? Wouldn’t it become a resource constraint for Anthropic at some point?
Related question: is it at all feasible to store the cache locally to offload memory costs, and then send it over the wire when needed?