The point of this is to reduce a complex tool surface to a single SQL query tool without losing the richness of the underlying representation.
In practice this lets me combine multiple, complex data sources with a constant number of tools. I can add a whole new database without adding a new tool. My prompts are effectively empty aside from metadata about the handful of tools the model has access to.
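A minimal sketch of the "constant number of tools" idea, assuming SQLite and Python's stdlib sqlite3 module. The names make_sql_tool, run_sql, and the tickets database are hypothetical and only here for illustration; the actual setup may look quite different.

```python
import sqlite3

def make_sql_tool(conn, max_rows=50):
    """Build the single tool exposed to the model: SQL text in, rows or an error out."""
    def run_sql(query):
        try:
            cur = conn.execute(query)
            # cur.description is None for statements that return no rows.
            cols = [d[0] for d in cur.description] if cur.description else []
            return {"columns": cols, "rows": cur.fetchmany(max_rows)}
        except sqlite3.Error as exc:
            # Errors go back to the model as data so it can correct its query.
            return {"error": str(exc)}
    return run_sql

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO symbols (name) VALUES ('parse_config')")

# Adding a whole new data source is an ATTACH, not a new tool (hypothetical example data).
conn.execute("ATTACH DATABASE ':memory:' AS tickets")
conn.execute("CREATE TABLE tickets.issues (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO tickets.issues (title) VALUES ('parse_config crashes on empty file')")

run_sql = make_sql_tool(conn)
result = run_sql(
    "SELECT s.name, i.title FROM symbols s "
    "JOIN tickets.issues i ON i.title LIKE '%' || s.name || '%'"
)
```

The tool surface stays at one function no matter how many databases get attached, and a single query can join across them.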
This only seems to perform well with powerful models right now; I've only seen it work with GPT5.x. But when it does work, it works at least as well as a human given access to the exact same tools. The bootstrapping behavior is extremely compelling, in particular the way the LLM probes system tables and the like.
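To make the system-table probing concrete, here is a rough sketch of the kind of discovery queries a model might issue first, assuming a SQLite backend (other engines expose this through information_schema or similar). The table layout is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; "references" is quoted because REFERENCES is an SQL keyword.
conn.executescript("""
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT);
CREATE TABLE "references" (symbol_id INTEGER, file TEXT, line INTEGER);
""")

# Step 1: ask the catalog which tables exist.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Step 2: ask for each table's columns (row[1] is the column name).
schema = {
    table: [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    for table in tables
}
```

Nothing about the schema has to be in the prompt: two cheap queries are enough for the model to orient itself before writing real ones.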
The tasks this provides the most uplift for are the hardest ones. Being able to make targeted queries over tables like references and symbols dramatically reduces the number of tokens we need to handle throughout. Fewer tokens means fewer opportunities for error.
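As an illustration of a targeted query, here is a small sketch assuming SQLite and a made-up schema for the symbols and references tables (the real columns are not described here). In practice the model would write the SQL itself; the find_references helper is hypothetical and just wraps the query for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; "references" is quoted because it is an SQL keyword.
conn.executescript("""
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT);
CREATE TABLE "references" (symbol_id INTEGER, file TEXT, line INTEGER);
INSERT INTO symbols VALUES (1, 'parse_config', 'function'), (2, 'Config', 'class');
INSERT INTO "references" VALUES (1, 'cli.py', 12), (1, 'server.py', 88), (2, 'cli.py', 3);
""")

def find_references(name):
    """Return (file, line) pairs for every recorded use of the named symbol."""
    query = """
        SELECT r.file, r.line
        FROM "references" AS r
        JOIN symbols AS s ON s.id = r.symbol_id
        WHERE s.name = ?
        ORDER BY r.file, r.line
    """
    return conn.execute(query, (name,)).fetchall()

hits = find_references("parse_config")
```

The result is a couple of short rows rather than the contents of every file that mentions the symbol, which is where the token savings come from.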