Alifatisk • today at 10:13 AM • 4 replies

Have you all noticed that the latest releases from Chinese companies (Qwen3 Max Thinking, now Kimi K2.5) are benchmarking against Claude Opus now and not Sonnet? They are truly catching up, almost at the same pace.

Replies

conception • today at 2:11 PM

https://clocks.brianmoore.com

K2 is one of the only models to nail the clock face test as well. It’s a great model.

WarmWash • today at 2:52 PM

They distill the major Western models, so anytime a new SOTA model drops, you can expect the Chinese labs to update their models within a few months.

esafak • today at 3:12 PM

They are, in benchmarks. In practice Anthropic's models are ahead of where their benchmarks suggest.

zozbot234 • today at 10:21 AM

The benchmarking is sus; it's way more important to look at real usage scenarios.
