zozbot234 • today at 5:29 AM

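Context caching really just means storing the KV-cache for reuse. It saves rerunning prefill for that part of the context, but every new token that attends to the cached KV entries still costs more to compute than it would with a shorter context, since the attention work per token grows with the length of the context it reads.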
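To make the point concrete, here is a rough sketch of a toy cost model. It counts only causal attention score computations and ignores everything else (MLP layers, memory bandwidth, actual provider pricing), so treat the numbers as illustrative. The function names `prefill_ops` and `decode_ops` and the sizes used are made up for this example.

```python
def prefill_ops(n_tokens: int) -> int:
    """Attention score computations needed to prefill n_tokens from scratch.

    With causal attention, token k (1-based) attends to k positions,
    so the total is 1 + 2 + ... + n.
    """
    return n_tokens * (n_tokens + 1) // 2


def decode_ops(context_len: int, new_tokens: int) -> int:
    """Attention score computations to generate new_tokens after a context
    whose KV entries are already available (for example, from a cache).

    New token i (0-based) attends to the context_len cached positions,
    the i new tokens before it, and itself.
    """
    return sum(context_len + i + 1 for i in range(new_tokens))


# Hypothetical sizes, chosen only for illustration.
prefix_len, new_tokens = 1000, 10

uncached_total = prefill_ops(prefix_len) + decode_ops(prefix_len, new_tokens)
cached_total = decode_ops(prefix_len, new_tokens)  # prefill of the prefix is skipped
no_context = decode_ops(0, new_tokens)             # same generation with no prefix at all

print("prefill + decode, no cache:", uncached_total)
print("decode only, prefix cached:", cached_total)
print("decode only, no prefix:    ", no_context)
```

In this model the cache removes the prefill term entirely, yet the ten generated tokens are still far more expensive than the same ten tokens generated with no prefix, because each one reads all of the cached positions.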
