I ran into this exact problem building a MCP server. 85 tools in experimental mode, ~17k tokens just for the tool manifest before any work starts.
The fix I (well Codex actually) landed on was toolset tiers (minimal/authoring/experimental) controlled by env var, plus phase-gating, now tools are registered but ~80% are "not connected" until you call _connect. The effective listed surface stays pretty small.
Lazy loading basically, not a new concept for people here.