Hacker News

windowshopping · yesterday at 9:10 PM · 2 replies

The part that eludes me is how you get from this to the capability to debug arbitrary coding problems. How does statistical inference become reasoning?

For a long time, it seemed the answer was "it doesn't." But now, using Claude Code daily, it seems it does.


Replies

ferris-booler · yesterday at 10:29 PM

IMO your question is the largest unknown in the ML research field (neural net interpretability is a related area), but the most basic explanation is "if we can always accurately guess the next 'correct' word, then we will always answer questions correctly".
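The "perfect next-word guesser answers questions correctly" claim can be made concrete with a toy sketch. Everything here is hypothetical (a hand-written `ORACLE` table standing in for a model's next-token distribution), but it shows how greedy decoding spells out a correct answer one token at a time if every per-token distribution is right:

```python
# Hypothetical "perfect" next-token distributions for one prompt.
# A real LLM computes these with a neural net; this is just a lookup.
ORACLE = {
    ("Q:", "2+2=?", "A:"): {"4": 0.97, "5": 0.02, "22": 0.01},
    ("Q:", "2+2=?", "A:", "4"): {"<eos>": 1.0},
}

def greedy_decode(prompt):
    """Repeatedly pick the highest-probability next token."""
    tokens = list(prompt)
    while True:
        dist = ORACLE[tuple(tokens)]
        next_tok = max(dist, key=dist.get)  # argmax = greedy step
        if next_tok == "<eos>":
            return tokens[len(prompt):]
        tokens.append(next_tok)

answer = greedy_decode(["Q:", "2+2=?", "A:"])
print(answer)  # -> ["4"]
```

If any of those per-step distributions puts its mass on a wrong token, the error compounds over the rest of the generation, which is why so much effort goes into the 'correct' part.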

An enormous amount of research and engineering work (most of the work at frontier labs) is being poured into making that 'correct' modifier happen, rather than just predicting the next token from 'the internet' (the naive original training corpus). This work takes the form of improved training data (e.g. expert annotations), human-feedback finetuning (e.g. RLHF), and most recently reinforcement learning (e.g. RLVR, RL with verifiable rewards), where the model is trained to find the correct answer to a problem without token-level guidance. RL for LLMs is a very hot research area and very tricky to get right.
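A minimal sketch of the "verifiable reward" idea behind RLVR, purely illustrative (real pipelines run a full RL algorithm such as PPO or GRPO on top of this signal, and the checker is usually a unit test or symbolic verifier rather than string equality):

```python
def verifiable_reward(model_answer: str, expected: str) -> float:
    """Score only the end result -- no token-level guidance.
    1.0 iff the final answer checks out, else 0.0."""
    return 1.0 if model_answer.strip() == expected.strip() else 0.0

# Sampled completions for a math problem. Only correctness matters,
# not how closely the text matches a reference solution.
samples = ["4", "5", " 4 "]
rewards = [verifiable_reward(s, "4") for s in samples]
print(rewards)  # [1.0, 0.0, 1.0]
```

The RL update then upweights whole trajectories that earned reward 1, which is how the model learns multi-step problem solving without being told which individual tokens were right.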

fc417fc802 · yesterday at 10:26 PM

Because it's not statistical inference on words or characters, but stacked layers of statistical inference over ~arbitrarily complex semantic concepts, applied recursively.
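The "stacked layers, applied recursively" shape can be sketched in a few lines. This is a deliberately crude stand-in (random weights, tanh mixing, no attention), not a real transformer, but it shows the two mechanisms the comment names: each layer re-mixes the feature vector into more abstract combinations, and the whole stack is re-run with its own output folded back in for every generation step:

```python
import math, random

random.seed(0)
D, L = 8, 4  # feature-vector size, number of stacked layers

# One random D x D weight matrix per layer (rows index inputs).
Ws = [[[random.gauss(0, 1 / D**0.5) for _ in range(D)] for _ in range(D)]
      for _ in range(L)]

def layer(x, W):
    """One round of statistical mixing over the feature vector."""
    return [math.tanh(sum(xi * wij for xi, wij in zip(x, col)))
            for col in zip(*W)]

def forward(x):
    for W in Ws:          # stacked layers: concepts built on concepts
        x = layer(x, W)
    return x

x = [random.gauss(0, 1) for _ in range(D)]
for _ in range(3):        # recursive: each output feeds the next pass
    x = forward(x)
print(len(x))  # 8
```

In a real model the "concepts" are directions in a much higher-dimensional space and each layer is conditioned on the whole context via attention, but the layered-then-looped structure is the same.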
