Even Sonnet has degraded for me right now to the level of ChatGPT 3.5 back then. It took ~5 hours to get a Playwright e2e test fixed that was waiting on a wrong CSS selector. Literally, dumb as fuck. And it had still been better than Opus for the last week or so... I did roughly comparable work for the last 2 weeks and it all got increasingly worse: burning more and more thinking tokens circling around nonsense and just not making 1-line changes that a junior dev would see on the spot. Too used to vibing now to do it by hand (yeah, I know), so I kept watching and meanwhile discovered that Codex just fleshed out a nontrivial app with correct financial data flows in the same time without any fuss. I really don't get why Anthropic is dropping their edge so hard recently; in my head they would be aiming for increasing hype leading up to the IPO, not disappointment crashes from their power user base.
"Error: claude-opus-4-6[1m] is temporarily unavailable".
It seems like we're hitting a solid plateau of LLM performance with only slight changes each generation. The jumps between versions are getting smaller. When will the AI bubble pop?
so excited!
I wonder if this one will be able to stop putting my fucking Python imports inline LIKE I'VE TOLD IT A THOUSAND TIMES.
Sigh here we go again, model release day is always the worst day of the quarter for me. I always get a lovely anxiety attack and have to avoid all parts of the internet for a few days :/
> indeed, during its training we experimented with efforts to differentially reduce these capabilities
Can't wait for the Chinese models to make arrogant Silicon Valley irrelevant.
We all know this is actually Mythos but called Opus 4.7 to avoid disappointments, right?
I just subscribed this month again because I wanted to have some fun with my projects.
Tried out Opus 4.6 a bit and it is really, really bad. Why do people say it's so good? It cannot come up with any half-decent VHDL, no matter the prompt. I'm very disappointed. I was told it's a good model.
New model - that explains why for the past week/two weeks I had this feeling of 4.6 being much less "intelligent". I hope this is only some kind of paranoia and we (and investors) are not being played by the big corp. /s
TL;DR: iPhone is getting better every year
The surprise: agentic search is significantly weaker somehow hmm...
So many messages about how Codex is suddenly better than Claude from one day to the next, while my experience is exactly the same as before. Is OpenAI botting the thread? I can't believe this is genuine content.
> In Claude Code, we’ve raised the default effort level to xhigh for all plans.
Does that also mean running out of credits faster?
Backlash on HN for Anthropic adjusting usage limits is insane. There's almost no discussion about the model, just people complaining about their subscription.
Introducing a new upgraded slot machine named "Claude Opus" in the Anthropic casino.
You are in for a treat this time: it is the same price as the last one [0] (if you are using the API).
But it is slightly less capable than the other slot machine, named "Mythos", the one everyone wants to play around with. [1]
[0] https://claude.com/pricing#api
[1] https://www.anthropic.com/news/claude-opus-4-7