Wow, this is refreshing DX compared to iterating all messages like we did back in '24.

drewnick • today at 1:21 AM • 1 reply • view on HN

Replies

I would disagree. Having all the messages locally and sending them with the request means you can switch inference providers or even models mid-conversation. It also means that the provider doesn't store the entire context, which often contains massive parts of proprietary codebases, secrets and PII and instead the agent harness manages all that. While a simple `continue thread` API field might seem more convenient, the cost is still determined by the input token count and cache rate, so it just abstracts this crucial implementation detail away.

alt Hacker News

Replies