logoalt Hacker News

red75primetoday at 4:38 AM0 repliesview on HN

Reasoning allows to produce statements that are more likely to be true based on statements that are known to be true. You'd need to structure your "falsehood training data" in a specific way to allow an LLM to generalize as well as with the regular data (instead of memorizing noise). And then you'll get a reasoning model which remembers false premises.

You generate your text based on a "stochastic parrot" hypothesis with no post-validation it seems.