Hacker News

OsrsNeedsf2P yesterday at 6:55 PM

So it's trained on the SWE Bench Pro evalset


Replies

topsycatt yesterday at 8:49 PM

That's not accurate. Take a look at the paper to see what it is trained on! Specifically, decontamination is called out in A.4.

https://microsoft.ai/wp-content/uploads/2026/06/main_2026060...

lemonish97 yesterday at 6:56 PM

What is your evidence for this claim?
