logoalt Hacker News

sosodevtoday at 6:16 PM0 repliesview on HN

Note that AA's coding index is only made up of two benchmarks: Terminal-Bench Hard and SciCode. I'm skeptical that it makes a good coding index. It ranks Gemma 4 31B above Deepseek V4 Flash. Having used both of those models for a broad variety of coding tasks I would choose Deepseek every day.