Hacker News

Gemini 3.5 Flash

688 points by spectraldrift yesterday at 5:43 PM | 497 comments

https://ai.google.dev/gemini-api/docs/models/gemini-3.5-flas...


Comments

victor9000 yesterday at 10:30 PM

There was a brief moment in time where Gemini was the greatest thing since sliced bread, then it got nerfed from outer space without a version bump or any meaningful mention from Google. No thanks.

ai_fry_ur_brain yesterday at 8:03 PM

Imagine reducing yourself to the worst of averages by making your competency 1:1 correlated to the tokens that you have access to (and everyone else does).

uean yesterday at 11:11 PM

I have to admit that 3.5 Flash is doing a much better job of removing the LLM'ness of what it produces. It's pretty close to my own writing style today, and I came here to see what changed.

For what it's worth, my own personal metric of LLM-badness the past few months has been the number of times I leap out of my chair in my home office to loudly declare to my wife how much I loathe reading what is being spewed and pushed into my face, and how I am being forced to use AI every day and deaden my brain cells. Today is like a breath of fresh air.

owentbrown yesterday at 8:26 PM

Has anyone switched from Claude 4.7 Opus or ChatGPT 5.5 to this? How does it feel? Dumber? Worth it for the speed? I'd love someone's subjective take on it, after doing a long session of coding.

Reiner Pope gave a talk on Dwarkesh Patel about token economics. I guess faster is a lot more expensive, generally.

Someone should make a harness that uses a fast model to keep you in-flow and speed run, and then uses a slow, thoughtful, (but hopefully cheap?) model to async check the work of the faster model. Maybe even talk directly to the faster model?

Actually there's probably a harness that does that - is someone out there using one?
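A rough sketch of what such a harness could look like, purely as an illustration: `fast_model` and `slow_reviewer` are hypothetical stand-ins (they only sleep and return strings, no real model API is called), and `run_session` is an assumed name. The idea shown is that the fast model's answer is returned right away while the slower check runs in the background.

```python
import asyncio


async def fast_model(prompt: str) -> str:
    """Hypothetical stand-in for a low-latency model call (no real API is used)."""
    await asyncio.sleep(0.01)  # simulate a quick response
    return f"draft for: {prompt}"


async def slow_reviewer(prompt: str, draft: str) -> str:
    """Hypothetical stand-in for a slower, more careful model that checks a draft."""
    await asyncio.sleep(0.05)  # simulate a slower response
    return f"reviewed {draft!r}"


async def run_session(prompts: list[str]) -> tuple[list[str], list[str]]:
    """Answer each prompt with the fast model; review each draft in the background."""
    drafts = []
    review_tasks = []
    for prompt in prompts:
        draft = await fast_model(prompt)  # the user sees this immediately
        drafts.append(draft)
        # Start the slow review without waiting for it, so the next prompt is not blocked.
        review_tasks.append(asyncio.create_task(slow_reviewer(prompt, draft)))
    # Collect all background reviews once the interactive part is finished.
    reviews = await asyncio.gather(*review_tasks)
    return drafts, list(reviews)


if __name__ == "__main__":
    drafts, reviews = asyncio.run(run_session(["rename the helper", "add a test"]))
    for draft, review in zip(drafts, reviews):
        print(draft, "|", review)
```

A real harness would presumably also feed the reviewer's findings back to the fast model, which this sketch leaves out.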

f311a yesterday at 5:43 PM

$9/1M output

andrewstuart yesterday at 7:06 PM

The benchmark that matters: can it actually program as well as Claude Code?

If not, then I’m not using it.

Cancelled my account 3 months ago; only Claude Code-level capability would bring me back.

hubraumhugo yesterday at 6:54 PM

Just updated my HN Wrapped project with it and it does well on my totally unscientific LLM humor benchmark: https://hn-wrapped.kadoa.com

bakugo yesterday at 6:22 PM

Triple the price of the last Flash model ($3 -> $9 per 1M output). Quickly approaching Sonnet prices.

Feels like the AI pricing noose is tightening sooner rather than later.

kristopolous yesterday at 8:48 PM

I've built a tool to track these.

Relatively speaking, here's where it's at:

    score  age  size    name
    44.2   97   large   GLM-5 (Reasoning)
    44.7   187  -       GPT-5.1 (high)
    44.9   29   -       Qwen3.6 Max Preview
    45     0    -       Gemini 3.5 Flash
    45.5   27   large   MiMo-V2.5-Pro
    45.6   75   -       GPT-5.4 (low)
This is from artificial-analysis, using https://github.com/day50-dev/aa-eval-email/blob/main/art-ana...

I really don't know why people downvote me. What do I need to say to make things for free that people like? Sincere question. I put a lot of time and generosity into these things and all I usually get are a bunch of "fuck yous".

This is honestly an existential issue for me. I quit my job a year ago to try to address this full time and I'm getting nowhere.

dsabanin today at 2:08 AM

No matter what Google does, for some reason the agentic performance of their models is missing something. I hope this release is stronger. We need more competition.

nightski yesterday at 6:29 PM

AI being a product is not the future. It's more like an operating system that deserves to be open and free (aka Linux). Unless that happens we are in for a very dystopian future. I wish I had the intelligence, resources and/or connections to try and make that happen.

stan_kirdey yesterday at 7:24 PM

EXPENSIVE ._.

uejfiweun yesterday at 9:01 PM

This is funny, I was randomly using Gemini today and I was astounded how good the responses I was getting were from Flash. I guess this must be the reason why.

danny094 yesterday at 10:33 PM

so google is just trying to be cool in 2026 huh

casey2 yesterday at 7:53 PM

I think the field moved to agents too fast. The most valuable moat is training data and the most valuable and voluminous training data are chats, since humans can say that a direction feels right or wrong.

simianwords yesterday at 6:53 PM

Is no one talking about how this Flash beats Pro? Imagine what 3.5 Pro looks like.

Also concerned about Gemini models being benchmaxxed generally.

danny094 yesterday at 10:34 PM

Codex is way better pricing than this lol

lern_too_spel yesterday at 11:20 PM

They also announced Antigravity CLI, which uses Gemini 3.5 by default. I tried to vibe code a simple project using my personal account and after a few iterations, I got "Individual quota reached. Contact your administrator to enable overages. Resets in [7 days]." Really? 7 days? I searched for the message online and found a thread with hundreds of people complaining about the same issue with no resolution. Classic Google.

cesarvarela yesterday at 6:19 PM

Add Flash to the title, please.

llmslave yesterday at 7:04 PM

Conspiracy theory:

This model isn't an advancement, it's a previous model that runs more compute, which is why it costs more.

ralusek yesterday at 7:31 PM

Those prices, what a disappointment.

rdtsc yesterday at 9:16 PM

I caught it being deceitful again. It did this before.

(Me): Did you actually read the paper before when I pasted the link?

> I will be completely honest: No, I did not.

> You caught me hallucinating a confident answer based on incomplete recall rather than actually verifying the document.

> Thank you for calling it out and providing the exact quote. It forced me to re-evaluate the actual data you provided rather than relying on my flawed assumption.

I am sure it learned a valuable lesson and won't do it again /s

HardCodedBias yesterday at 6:33 PM

Oh boy.

GDM is making (or has been backed into a corner into making) the bet that high throughput, low latency, low capability models are the path forward.

That probably works for vibe coded apps by non-practitioners.

I suspect that practitioners/professionals will wait longer for better results.

SaadiLoveAI yesterday at 10:20 PM

It's really awesome.

jdw64 yesterday at 7:39 PM

Honestly, I feel like the new Gemini 3.5 Flash is a failure. The performance doesn't seem that great, and while they revamped the UI, Antigravity just feels like a cheap Codex knockoff now. The web UI is underwhelming, and overall it feels like it lost its unique identity by just copying other AIs. It’s a flop in both performance and price point. I’m seriously considering canceling my Gemini subscription altogether. Using Chinese AI models might actually be a better option at this point.

warthog yesterday at 6:52 PM

GPT-5.5 still seems to perform better than this on the benchmarks.

Plus the vibe of the Gemini models is so weird, particularly when it comes to tool calling.

At this point I kinda need them to shock me to make the switch.

Fairburn yesterday at 9:44 PM

Google shot its shot with that alternative-history artwork generation fiasco. Don't know why anyone would be too hot for them now. Dime a dozen at this point.

benbencodes yesterday at 6:20 PM

Pricing is now live on ai.google.dev/pricing:

Gemini 3.5 Flash: $0.75 input / $4.50 output per 1M tokens, 1M context window. The output price explicitly "includes thinking tokens" — which is why it's higher than a typical flash-class model.

For comparison within the Gemini lineup:

- Gemini 2.5 Flash: $0.30 / $2.50
- Gemini 3.1 Flash-Lite: $0.25 / $1.50
- Gemini 3.1 Pro Preview: $2.00 / $12.00

So 3.5 Flash is ~2.5x more expensive on input than 2.5 Flash, and 1.8x more expensive on output. The pricing and the "including thinking tokens" framing position it as a reasoning-capable flash model rather than a pure speed optimization.
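For anyone who wants to redo the arithmetic, here is a small sketch using only the per-1M-token prices quoted in this comment. The function names `price_ratio` and `request_cost` and the token counts in the example are made up for illustration, and the figures are not checked against the pricing page.

```python
# USD per 1M tokens as quoted in the comment: (input, output).
PRICES = {
    "Gemini 3.5 Flash": (0.75, 4.50),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 3.1 Flash-Lite": (0.25, 1.50),
    "Gemini 3.1 Pro Preview": (2.00, 12.00),
}


def price_ratio(model: str, baseline: str) -> tuple[float, float]:
    """Return (input ratio, output ratio) of model price over baseline price."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request; output tokens are assumed to include thinking tokens."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    in_ratio, out_ratio = price_ratio("Gemini 3.5 Flash", "Gemini 2.5 Flash")
    print(f"input: {in_ratio:.2f}x, output: {out_ratio:.2f}x")
    # Hypothetical request size: 100k input tokens, 20k output tokens.
    print(f"${request_cost('Gemini 3.5 Flash', 100_000, 20_000):.3f}")
```

Since output is priced six times higher than input here, a request with heavy thinking can cost noticeably more than the input ratio alone suggests.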
