Hacker News

wslh today at 11:07 AM

I think it's worth looking at the recent XBOW benchmark: https://xbow.com/blog/mythos-offensive-security-xbow-evaluat... They realized that ChatGPT 5.5 works better, so the secret is in the architecture (including humans in the loop).


Replies

baq today at 11:15 AM

'frontier tokens are not fungible'