logoalt Hacker News

capitainenemolast Thursday at 8:01 PM1 replyview on HN

And performs very well on the latest 100 puzzles too, so isn't just learning the data set (unless I guess they routinely index this repo).

I wonder how well AIs would do at bracket city. I tried gemini on it and was underwhelmed. It made a lot of terrible connections and often bled data from one level into the next.


Replies

woogerlast Friday at 11:50 AM

> unless I guess they routinely index this repo

This sounds like exactly the kind of thing any tech company would do when confronted with a competitive benchmark.

show 1 reply