We've gone from yearly releases to quarterly releases.
If the pace of releases keeps accelerating, we're headed for weekly releases by mid-2027 or 2028.
Gemini is so stubborn, and it often doesn't follow explicit, simple instructions. So annoying.
> Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering. Today, we’re releasing the upgraded core intelligence that makes those breakthroughs possible: Gemini 3.1 Pro.
So this is the same, but not the same, as Gemini 3 Deep Think? Keeping track of these different releases is getting pretty ridiculous.
The CLI needs work, or they should officially allow third-party harnesses. Right now, the CLI experience is noticeably behind other SOTA models. It actually works much better when paired with Opencode.
But with accounts reportedly being banned over ToS issues, similar to Claude Code, it feels risky to rely on it in a serious workflow.
Fine, I guess. The only commercial API I use to any great extent is gemini-3-flash-preview: cheap, fast, and great for tool use with agentic libraries. The 3.1-pro-preview is great, I suppose, for people who need it.
Off topic, but I like to run small models on my own hardware, and some small models are now very good at tool use with agentic libraries; it just takes a little more work to get good results.
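For anyone wondering what "tool use with agentic libraries" looks like in practice, here is a minimal sketch against an OpenAI-compatible endpoint, the kind llama.cpp or Ollama exposes for local models. The base URL, the model name, and the get_weather tool are placeholder assumptions for illustration, not anything specific to Gemini or to a particular local model.

    # Minimal tool-use sketch against an OpenAI-compatible endpoint.
    # Assumption: a local server (llama.cpp, Ollama, vLLM, ...) is listening
    # at http://localhost:11434/v1 and serving a small tool-capable model.
    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    # get_weather is a made-up example tool, declared in JSON Schema form.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="qwen2.5:7b",  # placeholder; any small tool-capable model
        messages=[{"role": "user", "content": "What's the weather in Zurich?"}],
        tools=tools,
    )

    # If the model chose to call the tool, its arguments arrive as a JSON string.
    for call in resp.choices[0].message.tool_calls or []:
        print(call.function.name, json.loads(call.function.arguments))

The same snippet works against a hosted API by swapping the base URL and model name; with small local models the extra work is mostly prompt and tool-schema tuning to get reliable calls.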
Someone needs to make an actually good benchmark for LLMs that matches real-world expectations; there's more to benchmarks than accuracy against a dataset.
Google really seems to be pulling ahead in this AI race. For me personally they offer the best deal, although the software isn't quite there yet compared to OpenAI or Anthropic (in terms of the web GUI and the agent CLI). I hope they can fix that in the future, and I think once Gemini 4 or whatever launches we will see a huge leap again.
Does anyone know if this is in GA immediately or if it is in preview?
On our end, Gemini 3.0 Preview was very flaky (not a model-quality issue, but the API responses sometimes errored out), making it unreliable.
Does this mean that 3.0 is now GA at least?
Writing-style-wise, 3.1 seems very verbose, but somehow less creative compared to 3.
The eventual nerfing gives me pause. Flash is awesome. What we really want is gemini-3.1-flash :)
There's a very short blog post up: https://blog.google/innovation-and-ai/models-and-research/ge...
Great model until it gets nerfed. I wish they had a higher paid tier to use the non-nerfed model.
I'm keen to know how and where you are using Gemini.
Anthropic is clearly targeting developers and OpenAI is the general go-to AI model. Who is the target demographic for Gemini models? I know they're good and Flash is super impressive, but I'm curious.
Another preview release. Does that mean the models Google recommends for production are still 2.5 Flash and Pro? Not talking about what people are actually doing, but Google's recommendation. Kind of crazy if that's the case.
I use Gemini flash lite in a side project, and it’s stuck on 2.5. It’s now well behind schedule. Any speculation as to what’s going on?
Yeah great, now can I have my pinned chats back please?
https://www.google.com/appsstatus/dashboard/incidents/nK23Zs...
Gemini 3.0 Pro is a bad model for its class. I really hope 3.1 is a leap forward.
My first impression is that the model sounds slightly more human and a little more prone to praise. Still comparing its abilities.
Why should I be excited?
It's been hugged to death. I keep getting "Something went wrong".
It's fascinating to watch this community react so positively to Google model releases and so negatively to OpenAI's. You all do understand that an ad-revenue model is exactly where Google will go, right?
Somehow doesn't work for me :) "An internal error has occurred"
I hereby allow you to release models not at the same time as your competitors.
The biggest increase is LiveCodeBench Pro: 2887. The rest are in line with Opus 4.6 or slightly better or slightly worse.
Humanity's Last Exam 44%, SciCode 59, one benchmark at 80, another at 78, but never 100%.
It would be nice to see one of these models, Plus, Pro, Super, God mode, hit 100% on even one benchmark. Am I missing something here?
It appears the only difference from 3.0 Pro Preview is Medium reasoning. Model naming has long since stopped even trying to make sense, but considering 3.0 is still in preview itself, bumping the number for such a minor change is not a move in the right direction.
OK, so they're scared that 5.3 (Pro) will be released today or tomorrow and blow it out of the water, so they rushed this out while they could still reference 5.2 benchmarks.
The biggest problem is that it's slow. Also, safety seems overtuned at the moment; I'm getting some really silly refusals. Everything else is pretty good.
I hope to have a great next two weeks before it gets nerfed.
Google is terrible at marketing, but this feels like a big step forward.
As per the announcement, Gemini 3.1 Pro scores 68.5% on Terminal-Bench 2.0, which makes it the top performer on the Terminus 2 harness [1]. That harness is a "neutral agent scaffold," built by researchers at Terminal-Bench to compare different LLMs in the same standardized setup (same tools, prompts, etc.).
It has also taken the top spot on both the Intelligence Index and the Coding Index from Artificial Analysis [2], but on their Agentic Index it still lags behind Opus 4.6, GLM-5, Sonnet 4.6, and GPT-5.2.
---
[1] https://www.tbench.ai/leaderboard/terminal-bench/2.0?agents=...
Just wish I could get the 2.5 daily limit above 1,000 requests easily. Driving me insane...
More discussion: https://news.ycombinator.com/item?id=47075318
Please, I need 3 in GA…
OK, why don't you work on getting 3.0 out of preview first? A 10-minute response time is pretty heinous.
Relatedly, Gemini chat seems to be, if not down, then extremely slow.
ETA: They apparently wiped out everyone's chats (including mine). "Our engineering team has identified a background process that was causing the missing user conversation metadata and has successfully stopped the process to prevent further impact." El Mao.
To use in OpenCode, you can update the models it has:
opencode models --refresh
Then /models and choose Gemini 3.1 Pro.
You can use the model through OpenCode Zen right away and avoid that Google UI craziness.
---
It is quite pricey! Good speed and nailed all my tasks so far. For example:
@app-api/app/controllers/api/availability_controller.rb
@.claude/skills/healthie/SKILL.md
Find Alex's id, and add him to the block list, leave a comment
that he has churned and left the company. we can't disable him
properly on the Healthie EMR for now so
this dumb block will be added as a quick fix.
Result was: 29,392 tokens
$0.27 spent
So a relatively small task, hitting an API, using one of my skills, but a quarter. Pricey!
Doesn't show as available in the Gemini CLI for me. I have one of those "AI Pro" packages, but don't see it. Typical for Google, completely unclear how to actually use their stuff.
I always try Gemini models when they get updated with their flashy new benchmark scores, but always end up using Claude and Codex again...
I get the impression that Google is focusing on benchmarks without assessing whether the models are actually improving in practical use cases.
I.e. they are benchmaxing
Gemini is "in theory" smart, but in practice is much, much worse than Claude and Codex.
The visual capabilities of this model are frankly kind of ridiculous, what the hell.
I know Google has Antigravity, but do they have anything like Claude Code in terms of user interface, basically a terminal TUI?
Whoa, I think Gemini 3 Pro was a disappointment, but Gemini 3.1 Pro is definitely the future!
Can we switch from Claude Code to Google yet?
Benchmarks are saying: just try.
But the real world could be different.
I have run into a surprising number of basic syntax errors on this one. At least in the few runs I have tried, it's a swing and a miss. I wonder if the pressure of the Claude release is pushing these stopgap releases.