This is exactly how it should work. I imagine it as a tree view showing both full and summarized token counts at each level, so you can immediately see what’s taking up space and what you’d gain by compacting it.
The agent could pre-select what it thinks is worth keeping, but you’d still have full control to override it. Each chunk could have three states: drop it, keep a summarized version, or keep the full history.
That way you stay in control of both the context budget and the level of detail the agent operates with.
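A minimal sketch of what that tree could look like as a data structure. Everything here is hypothetical: the `Chunk` and `Keep` names, the token numbers, and the rule that a summarized or dropped parent applies to its whole subtree are assumptions for illustration, not how any existing agent works.

```python
from dataclasses import dataclass, field
from enum import Enum


class Keep(Enum):
    DROP = "drop"        # remove the chunk entirely
    SUMMARY = "summary"  # keep only the summarized version
    FULL = "full"        # keep the full history


@dataclass
class Chunk:
    label: str
    full_tokens: int = 0      # tokens in this chunk's own full text
    summary_tokens: int = 0   # tokens in this chunk's own summary
    state: Keep = Keep.FULL   # agent pre-selects, user can override
    children: list["Chunk"] = field(default_factory=list)

    def total(self, kind: str) -> int:
        """Sum the 'full' or 'summary' token counts over this subtree."""
        own = self.full_tokens if kind == "full" else self.summary_tokens
        return own + sum(c.total(kind) for c in self.children)

    def kept(self) -> int:
        """Tokens that remain in context given the current selections."""
        if self.state is Keep.DROP:
            return 0
        if self.state is Keep.SUMMARY:
            return self.total("summary")
        # FULL: keep own text and defer to each child's own state
        return self.full_tokens + sum(c.kept() for c in self.children)


def render(chunk: Chunk, depth: int = 0) -> list[str]:
    """One line per node showing full, summarized, and kept counts."""
    line = (
        f"{'  ' * depth}[{chunk.state.value}] {chunk.label}: "
        f"full={chunk.total('full')} summary={chunk.total('summary')} "
        f"kept={chunk.kept()}"
    )
    lines = [line]
    for child in chunk.children:
        lines.extend(render(child, depth + 1))
    return lines


# Made-up example session with one chunk in each state
session = Chunk("session", children=[
    Chunk("explore repo", 4000, 300, Keep.SUMMARY),
    Chunk("failed approach", 6500, 400, Keep.DROP),
    Chunk("current fix", 2500, 250, Keep.FULL),
])

print("\n".join(render(session)))
```

Overriding the agent's pre-selection is then just a matter of changing a node's `state` and re-rendering to see the new budget.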
I do find it really interesting that more coding agents don't offer this as a toggleable feature. Sometimes you really need this level of control to get useful work out of them.
I do my own compaction: I have the agent write everything out to a file, prune what's no longer relevant, and then start a new session with that file.
But I'm mostly working on personal projects so my time is cheap.
I might experiment with having the file sections post-processed through a token counter though, that's a great idea.
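A rough sketch of that post-processing step, assuming the file uses `#`-style headings to mark sections. The `rough_token_count` function is only a stand-in that counts words and punctuation marks; a real model tokenizer will give different numbers, so swap it out if you need accurate counts. All names here are hypothetical.

```python
import re


def rough_token_count(text: str) -> int:
    # Stand-in heuristic: one token per word or punctuation mark.
    # Not a real model tokenizer; counts will differ from actual usage.
    return len(re.findall(r"\w+|[^\w\s]", text))


def section_token_counts(notes: str) -> list[tuple[str, int]]:
    """Split notes on '#' heading lines and count tokens per section body."""
    sections = [["(preamble)", []]]
    for line in notes.splitlines():
        if line.startswith("#"):
            sections.append([line.lstrip("#").strip(), []])
        else:
            sections[-1][1].append(line)
    counts = [
        (title, rough_token_count("\n".join(body)))
        for title, body in sections
    ]
    # Drop the preamble entry when nothing precedes the first heading
    if len(counts) > 1 and counts[0][1] == 0:
        counts = counts[1:]
    return counts


# Made-up notes file content
notes = (
    "# Goal\n"
    "Fix the login bug.\n"
    "\n"
    "# Dead ends\n"
    "Tried patching the cache, no luck.\n"
)

for title, count in section_token_counts(notes):
    print(f"{count:>6}  {title}")
```

Sorting the output by count makes it easy to spot which sections are worth pruning first before starting the new session.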