Cursor: Find me another benchmark where Composer 2.5 is a top 10 frontier coding model

mi_lk • today at 12:27 PM • 1 reply

Replies

(I work at Cursor) We score well on Terminal-Bench and SWE-bench Multilingual. DeepSWE, not so great yet, as it's more for very long-horizon tasks. We're planning to include more public benchmarks in our next model release.
