tibbar • today at 5:28 AM • 1 reply

Doesn't context caching mostly eliminate this problem? (Though I suppose that with enough context, even the 90%-discounted price eventually adds up to a lot anyway.)

Replies

zozbot234 • today at 5:29 AM

Context caching really just stores the KV-cache for reuse. It saves re-running prefill for that part of the context, but new tokens that attend to that KV-cache still cost more than they would with a shorter context.
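
To make that concrete, here is a rough sketch of a toy cost model. It is not how any provider actually bills or schedules requests. The function name `request_cost` and the unit of "one query-key attention interaction" are assumptions made for illustration, and the model ignores MLP compute, memory bandwidth, and pricing entirely.

```python
def request_cost(cached: int, fresh: int, new_tokens: int) -> dict:
    """Toy attention cost for one request, in query-key interactions.

    cached:     prompt tokens whose KV entries are reused (hypothetical cache hit)
    fresh:      prompt tokens that still need prefill
    new_tokens: tokens generated during decode
    """
    # Prefill only runs over the fresh tokens; each one attends to the
    # cached prefix, the fresh tokens before it, and itself.
    prefill = sum(cached + i + 1 for i in range(fresh))

    # Decode: every generated token attends to the whole context so far,
    # whether or not that context came out of the cache.
    context = cached + fresh
    decode = sum(context + i + 1 for i in range(new_tokens))

    return {"prefill": prefill, "decode": decode}


if __name__ == "__main__":
    print(request_cost(cached=0, fresh=1000, new_tokens=10))     # no cache hit
    print(request_cost(cached=1000, fresh=0, new_tokens=10))     # full cache hit
    print(request_cost(cached=100_000, fresh=0, new_tokens=10))  # long cached context
```

In this toy model a full cache hit drives the prefill term to zero, while the decode term is unchanged and keeps growing with the length of the cached context.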
