Frontier models don't keep trying until they succeed. That's a harness problem, and best believe the best harnesses are private, not public.
It is much more of a context window size and model capability problem. Local models are not even remotely close to solving complex problems, even when used with the same harness.