Hacker News

WarmWash, today at 5:50 PM

3.1 Pro is the first model to correctly count the number of legs on my "five-legged dog" test image. 3.0 Flash was the previous best, getting it after a few prompts of poking. 3.1 got it on the first prompt, though, with the prompt being "How many legs does the dog have? Count Carefully".

However, it didn't get it on the first try with the original prompt ("How many legs does the dog have?"). It initially said 4, then after a follow-up prompt it hesitantly said 5, reasoning that one limb must be obscured or hidden.

So maybe I'll give it a 90%?

This is without tools as well.


Replies

merlindru, today at 5:53 PM

Your question may have become part of the training data, given how much coverage there was around it. Perhaps you should devise a new test :P
