> The secret sauce though is all the datasets, RL training, knowledge of what works from doing all kinds of ablation experiments, and a massive compute moat.
ReAct loops and tool-calling are the critical development feature. They turn a model from something that generates text into something that can independently influence the world around them.
Without agent features, you have just a chatbot.
The big breakthrough is we can interact with the agents using natural language - because of the LLM.
It is the combination of LLM and agent-harnesses that make it look really smart. Agent-harness is a programmatic device that lets us tap into the vast knowledge in the LLM.
It is probabaly true that many TV-commentators fail to appreciate this fact and therefore think LLMs are super-intelligent. No, it is the combination of LLM and the programmatic agent-haness that is the breakthrough.
An interesting thought is that the LLM could in theory code the agent-harrness, start it running every time we interact with it. Currently the agent-harrness I think is pretty static I think. In theory it could be dynamically created for every task. Would that make it better don't know.