applfanboysbgon today at 6:39 AM

> Deep seek 3.2 is 4% on Arc-AGI 2

Why are you bringing up an outdated Chinese model from 6 months ago to compare to a US model from 6 months ago? The outdated Chinese model will have performance from ~12 months ago, obviously. But today's Chinese model, DeepSeek 4, performs not far from the US model of 6 months ago: 46% compared to 52% for 5.2.


Replies

gpt5 today at 7:55 AM

Because DeepSeek 4.0 is not yet there, but the jump isn't expected to be large. Kimi 2.5 is there and is also scoring low.
