Hacker News

movedx01, yesterday at 1:27 PM

Probably the same way other models learned to surpass human ability while being bootstrapped from human-level data - using reinforcement learning.

The question is, do we have good enough feedback loops for that, and if not, are we going to find them? I would bet they will be found for a lot of use cases.