
YetAnotherNick, yesterday at 3:47 PM

Yeah, it's a knowledge benchmark, not an agentic benchmark.


Replies

esafak, yesterday at 3:54 PM

That's like saying coding benchmarks are about memorizing the language syntax. You have to know what to call, when, and how. If you get the job done, you win.
