Reasoning models with access to Python have been able to solve 4th grade math homework for over a year now. Prove me wrong: show me a 4th grade math problem they can't handle.
> show me a 4th grade math problem they can't handle
Sure.
"8 7 6 5 4 3 2 1 - add minus signs and parentheses to get 31."
P.S. There is an answer online and some LLMs will just copy it verbatim. This doesn't count.
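
For anyone who wants to check a candidate answer mechanically instead of looking it up, here is a minimal brute-force sketch in Python. It rests on one assumption the puzzle does not state: adjacent digits may be joined into multi-digit numbers (e.g. `87`), and the only operator is binary minus. The names `splits`, `expressions` and `solve` are made up for this illustration.

```python
from functools import lru_cache

DIGITS = "87654321"
TARGET = 31


def splits(s):
    """Yield every way to cut s into consecutive terms (ASSUMPTION: digits may be joined)."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        for rest in splits(s[i:]):
            yield [s[:i]] + rest


def expressions(terms):
    """Map each reachable value to one fully parenthesized expression that uses only '-'."""
    terms = tuple(terms)

    @lru_cache(maxsize=None)
    def build(lo, hi):
        # A single term evaluates to itself.
        if hi - lo == 1:
            return {int(terms[lo]): terms[lo]}
        out = {}
        # Try every position for the outermost minus sign.
        for mid in range(lo + 1, hi):
            for lv, ls in build(lo, mid).items():
                for rv, rs in build(mid, hi).items():
                    out.setdefault(lv - rv, f"({ls} - {rs})")
        return out

    return build(0, len(terms))


def solve(digits=DIGITS, target=TARGET):
    """Return one expression per digit grouping that evaluates to target."""
    found = []
    for terms in splits(digits):
        if len(terms) < 2:
            continue  # at least one minus sign is required
        table = expressions(terms)
        if target in table:
            found.append(table[target])
    return found


if __name__ == "__main__":
    for expr in solve():
        print(expr, "=", eval(expr))
```

If the digits have to stay single, the largest value you can reach is 8 - (7 - 6 - 5 - 4 - 3 - 2 - 1) = 22, so 31 is out of reach under that reading. That is why the sketch assumes joining is allowed.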