Why didn’t you offer any real world usage numbers to illustrate your point? I found this unhelpful.
That’s the fair point. The rtk promotional posts point to 60-90% tokens savings and there is no mention how they perform accuracy wise. The commenter below did great job pointing to resource showing caveman, rtk saving just couple bucks on $926 bill. Thanks, Llyoyd Christmas for linking to useful substack
I read another post oddly similar earlier today that has more explicit data on that authors codebase: https://codepointer.substack.com/p/cutting-llm-token-costs-w...
TLDR; ~3-4% savings to actual API costs with rtk, caveman, and headroom combined, but nothing tangible on if those cost reductions came at a cost of quality. By their calculations, rtk saved them $4.96 on a $926 bill.