adchurch • yesterday at 9:44 PM • 1 reply

We haven't experimented much with routing to local LLMs. Technically they benefit from the cache too, although it's more a question of latency than cost. But tbh I haven't seen great results in the wild from working with local LLMs for coding. Curious if you've had any success with them?

Replies

thandv • today at 6:20 PM

I've generally used them for token-saving purposes, just for repetitive tasks, gated and supervised by Claude. So it's planned and verified by better models, but implementation falls on local ones. It has been pretty effective for me, as long as I spend a bit more initially on splitting complex tasks down further.
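
A rough sketch of that split, in case it helps picture the gating. Everything here is hypothetical: `run_gated`, `SubtaskResult`, and the `fake_*` stand-ins are made-up names, and no real model is called. In practice `plan` and `verify` would wrap the stronger model and `implement` would wrap the local one.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SubtaskResult:
    subtask: str
    output: str
    attempts: int
    accepted: bool


def run_gated(
    task: str,
    plan: Callable[[str], List[str]],      # stronger model: split the task
    implement: Callable[[str], str],       # local model: do one small piece
    verify: Callable[[str, str], bool],    # stronger model: accept or reject
    max_attempts: int = 2,
) -> List[SubtaskResult]:
    """Plan with one model, implement with another, gate every output."""
    results = []
    for subtask in plan(task):
        output, accepted, attempts = "", False, 0
        # Retry the cheap implementer until the verifier accepts
        # or the attempt budget runs out.
        while attempts < max_attempts and not accepted:
            attempts += 1
            output = implement(subtask)
            accepted = verify(subtask, output)
        results.append(SubtaskResult(subtask, output, attempts, accepted))
    return results


# Stand-ins so the sketch runs without any model behind it.
def fake_plan(task: str) -> List[str]:
    return [f"{task}: step {i}" for i in (1, 2, 3)]


def fake_implement(subtask: str) -> str:
    return f"done({subtask})"


def fake_verify(subtask: str, output: str) -> bool:
    return output == f"done({subtask})"


if __name__ == "__main__":
    for r in run_gated("rename helper", fake_plan, fake_implement, fake_verify):
        print(r.subtask, r.attempts, r.accepted)
```

Anything that comes back with `accepted=False` after the retry budget is the part you'd hand back to the stronger model or split further.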
