Hacker News

ashater, today at 4:53 PM

The reasoning ability is likely already part of the original model. It is well known that you can't get a 1B-parameter model to reason, even with RL.