GeorgeOldfield • yesterday at 7:53 PM • 3 replies

Gemini isn't even that good. I just tested 3.5 on the usual complex prompts I give to Opus/chat 5.5. Meh.

Replies

k8sToGo • yesterday at 8:19 PM

Are you really comparing flash to opus? Shouldn't you be comparing pro?

bachmeier • yesterday at 8:52 PM

Who would have guessed that something costing roughly a third as much wouldn't do as well at certain tasks.

kmac_ • yesterday at 8:31 PM

Well, my first impression is that Gemini still goes off the instruction rails more easily than other models, but I noticed that it tends to return to the initial goal without hand-holding, which is a real improvement. It's really interesting that these models behave so differently.
