Hacker News

alansaber today at 12:35 PM

Very interesting. I wonder how much of this is due to the context length. I am unclear on the implementation strategy: did you run this problem as a one-shot in chat mode, or run each one in an agent harness?


Replies

segmondy today at 1:08 PM

It has nothing to do with context length. They have experience training math models: they have a model that would take gold at the IMO and a Lean prover. Both have been out for almost a year.