Hacker News

andreagrandi · yesterday at 7:12 AM · 13 replies

I'm only waiting for OpenAI to provide an equivalent ~100 USD subscription to entirely ditch Claude.

Opus has gone steadily downhill over the last week (and before you start flooding me with replies: I've been testing Opus and Codex in parallel for the past week, and I have plenty of examples of Claude going off track, apologising, announcing "now it's all fixed!", and then fixing only part of it, while Codex nailed it on the first shot).

I can accept specific model limits, but not reliability that swings up and down. And don't even get me started on how bad the Claude client has become. Others are finally catching up, and gpt-5.3-codex is definitely better than opus-4.6.

Everyone else (Codex CLI, Copilot CLI, etc.) is going open source; Anthropic is going closed. Others (OpenAI, Copilot, etc.) explicitly allow using OpenCode; Anthropic explicitly forbids it.

This hostile behaviour is just the last straw.


Replies

super256 · yesterday at 12:25 PM

OpenAI forces users to verify with an ID and face scan when using Codex 5.3 if any of their conversations has been deemed high risk.

It seems like they currently have a lot of false positives: https://github.com/openai/codex/issues?q=High%20risk

abm53 · yesterday at 11:11 AM

I’m unsure exactly in what way you believe it has gone “downhill”, so this isn’t aimed at you specifically; it’s more a general pattern I see.

That pattern is people complaining that a particular model’s responses have degraded in quality over time, or that it has been “nerfed”, etc.

Although the models may evolve, and the tools calling them may change, I suspect a huge amount of this is simply confirmation bias.

seu · yesterday at 8:21 AM

> Opus has gone steadily downhill over the last week

Is a week the whole attention span of the late 2020s?

ifwinterco · yesterday at 7:56 AM

Opus 4.6 genuinely seems worse than 4.5 was in Q4 2025 for me. I know everyone always says this and anecdote != data but this is the first time I've really felt it with a new model to the point where I still reach for the old one.

I think I'll give GPT-5.3 Codex a real try.

trillic · yesterday at 1:51 PM

The rate limit on my $20 OpenAI/Codex account feels 10x higher than on the $20 Claude account.

GorbachevyChase · yesterday at 1:15 PM

I was underwhelmed by Opus 4.6. I didn’t get a sense of significant improvement, but the token usage was excessive, to the point that I dropped the subscription for Codex. I suspect that all the models are so glib that they can create a quagmire for themselves in a project. I have not yet found a satisfying strategy for non-destructive resets when the system’s own comments and notes poison new output. Fortunately, deleting and starting over is cheap.

dannersy · yesterday at 7:35 AM

No offense, but this is the most predictable outcome ever. The software industry at large does this over and over again, and somehow we're surprised: provide a thing for free or for cheap, then slowly draw back availability once you have dominant market share or find yourself needing money (ahem).

The providers want to control what AI does in order to make money or dominate an industry, so they don't have to make their money back right away. This was inevitable; I do not understand why we ever trust these companies.

submain · yesterday at 5:21 PM

> And don't even get me started on how bad the Claude client has become

The latest versions of Claude Code have been freezing and then crashing while waiting on long-running commands. It's pretty frustrating.

thepasch · yesterday at 4:01 PM

> I’m only waiting for OpenAI to provide an equivalent ~100 USD subscription to entirely ditch Claude.

I have a feeling Anthropic might be in for an extremely rude awakening when that happens, and I don’t think it’s a matter of “if” anymore.

neya · yesterday at 10:06 AM

It's the most overrated model there is. I do Elixir development primarily, and the model sucks balls in comparison to Gemini and GPT-5x. But the Claude fanboys will swear by it and attack you if you say even something remotely negative about their "god-sent" model. It fails miserably even in basic chat and research contexts and constantly goes off track. I wired it up to fire off some tasks; it kept hallucinating, swearing it had done them when it hadn't even attempted to. It was so unreliable I had to revert to Gemini.

show 2 replies
WarmWash · yesterday at 3:07 PM

My favorite conspiracy explanation:

Claude has gotten a lot of popular media attention in the last few weeks, and the influx of users is constraining compute/memory for an already compute-heavy model. So you get all the suspected "tricks": quantization, shorter thinking, KV cache optimizations.
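For anyone unfamiliar with what "quantization" buys a constrained provider, here's a minimal, purely illustrative sketch (nothing to do with Anthropic's actual serving stack): symmetric per-tensor int8 quantization, which shrinks weight memory roughly 4x relative to fp32 at the cost of some precision.

    import numpy as np

    # Illustrative sketch only, not any lab's real serving code:
    # store fp32 weights as int8 plus a single fp32 scale factor.
    def quantize_int8(w):
        scale = np.abs(w).max() / 127.0  # map max magnitude onto int8 range
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(4096, 4096).astype(np.float32)  # ~64 MB in fp32
    q, s = quantize_int8(w)                             # ~16 MB in int8
    print("max abs error:", np.abs(w - dequantize(q, s)).max())

The appeal under a compute crunch is obvious: the same hardware holds roughly 4x more weights in memory, at the price of exactly the kind of small, hard-to-pin-down quality loss users report.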

It feels like the same thing that happened to Gemini 3, and you can even feel it throughout the day (the models seem smartest at 12 am).

Dario, in his interview with Dwarkesh last week, also repeated the same refrain other lab leaders have: compute is constrained and there are big trade-offs in how you allocate it. It feels safe to reason, then, that they will use any trick they can to free up compute.

bbstats · yesterday at 1:10 PM

All this because of a single week?

cactusplant7374 · yesterday at 11:03 AM

No developer writes the same prompt twice. How can you be sure something has changed?
