simianwords • yesterday at 6:53 PM

Is no one talking about how this Flash beats Pro? Imagine what 3.5 Pro will look like.

I'm also concerned about Gemini models being benchmaxxed generally.

Replies

NitpickLawyer • yesterday at 7:12 PM

> concerned about Gemini models being benchmaxxed generally

I would say they are the least benchmaxxed of all the top labs, for coding. They've always been behind Opus/gpt-xhigh for agentic stuff (mostly because of poor tool use), but in raw coding tasks and the ability to take a paper/blog/idea and implement it, they've been punching above their benchmarks ever since 2.5. I would still take 2.5 over all the "Chinese model beats Opus" models if I could run it locally, tbh.
