gertlabs • today at 5:14 AM

We will as soon as API access is widely available. Once a model goes live, we typically have one-shot reasoning benchmarks up in ~8 hours and comprehensive agentic/combined benchmarks up after 24-48 hours. We're working on building relationships with each lab to have the results before launch.
