Hacker News

gertjandewilde, today at 3:25 PM (1 reply)

We built a unified API with a large surface area and ran into a problem when building our MCP server: tool definitions alone burned 50,000+ tokens before the agent touched a single user message.

The fix that worked for us was giving agents a CLI instead. ~80 tokens in the system prompt, progressive discovery through --help, and permission enforcement baked into the binary rather than prompts.
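To make the shape of that concrete, here is a minimal sketch in Python of a CLI with nested `--help` output and a scope check enforced in code rather than in the prompt. It is not our actual binary: the `acme` program name, the `contacts` commands, the scope strings, and the `run` helper are all hypothetical and only there for illustration.

```python
import argparse

# Hypothetical mapping from (resource, action) to the scope it requires.
# The names are illustrative, not taken from any real product.
REQUIRED_SCOPE = {
    ("contacts", "list"): "contacts:read",
    ("contacts", "delete"): "contacts:write",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a two-level command tree; each level has its own --help."""
    parser = argparse.ArgumentParser(
        prog="acme", description="Hypothetical unified-API CLI"
    )
    resources = parser.add_subparsers(dest="resource", required=True)

    contacts = resources.add_parser("contacts", help="Work with contacts")
    actions = contacts.add_subparsers(dest="action", required=True)

    list_cmd = actions.add_parser("list", help="List contacts")
    list_cmd.add_argument("--limit", type=int, default=20, help="Maximum results")

    delete_cmd = actions.add_parser("delete", help="Delete a contact")
    delete_cmd.add_argument("contact_id", help="ID of the contact to delete")
    return parser


def run(argv: list[str], granted_scopes: set[str]) -> str:
    """Parse argv, enforce the required scope, then dispatch the command."""
    args = build_parser().parse_args(argv)
    needed = REQUIRED_SCOPE[(args.resource, args.action)]
    if needed not in granted_scopes:
        # The check lives in the program, so prompt text cannot bypass it.
        raise PermissionError(f"missing scope: {needed}")
    if args.action == "list":
        return f"listing up to {args.limit} contacts"
    return f"deleted contact {args.contact_id}"
```

In this sketch the system prompt only has to name the program. The agent would call `acme --help` to see the resources, then `acme contacts --help` to see the actions, and so on, paying for help text only on the branches it explores. A call such as `run(["contacts", "delete", "42"], {"contacts:read"})` raises `PermissionError` regardless of what the prompt says.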

The post covers the benchmarks (Scalekit's 75-run comparison showed 4-32x token overhead for MCP vs CLI), the architecture, and an honest section on where CLIs fall short (streaming, delegated auth, distribution).


Replies