What do you mean? It tests whether the model knows the tools and uses them.
Yeah, it's a knowledge benchmark, not an agentic benchmark.