Hacker News

lukewarm707 yesterday at 8:01 AM

value is high but what about the competitors?

is claude that good? the last time i tried claude it was sonnet 4.5. it was ok, not worth the api money clearly. but i only use api tokens for llms.


Replies

port11 yesterday at 10:28 AM

If you look at SWE, Claude models aren’t that special. Other benchmarks come up with different results.

But… anecdotally, Claude is just that good. Gemini needs a lot of hand-holding, and it will still tell you it’s done when it has achieved only half the work. Or say, “this test isn’t passing, I’ll just delete it”. Every now and then I get tired of it and give the same task to Sonnet 4.6; 5 minutes later I’m done. Bug fixed, UI properly working, React hooks not being called conditionally, theme variables used properly. It’s wonderful.

I’m not sure about large agentic work or deep thinking, but I’m mostly automating away the drudgery of dealing with React Native. I still want to do the deeper work myself, but even there Opus is usually a really good sparring partner.
