Hacker News

whimblepop, today at 1:59 PM

Bullshitting is how LLMs work. It doesn't require active encouragement. All it takes is a machine without consciousness, without physical access to the world, and without an actually lived life. A training set that contains lots of confident answers and few to no refusals doesn't help either.


Replies

otabdeveloper4, today at 2:43 PM

It's simpler than that.

An LLM outputs tokens one by one. The loop stops only when it outputs the end-of-text token, which is, of course, statistically much rarer than any other kind of token.

(This is why you cannot, in general, prompt an LLM with something like "don't answer if the result is incorrect". It has to output something, by design.)
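
For anyone who wants to see the loop spelled out, here is a minimal sketch in Python. It is not how any particular model or library implements decoding: `toy_next_token_probs`, the `<eos>` marker, and all the probabilities are made up for illustration, and a real LLM would compute the distribution from the context.

```python
import random

EOS = "<eos>"  # hypothetical end-of-text marker for this toy example


def toy_next_token_probs(context):
    """Stand-in for a real model: returns a next-token distribution.

    The numbers are invented; they only encode the assumption that
    end-of-text is much less likely than ordinary tokens.
    """
    if not context:
        # Nothing generated yet: stopping immediately is possible but unlikely.
        return {"I": 0.50, "The": 0.49, EOS: 0.01}
    return {"answer": 0.4, "is": 0.3, "42": 0.2, EOS: 0.1}


def generate(next_token_probs, max_tokens=50, rng=None):
    """Sample tokens one at a time until EOS is drawn or max_tokens is hit."""
    rng = rng or random.Random()
    tokens = []
    for _ in range(max_tokens):
        probs = next_token_probs(tokens)
        choices, weights = zip(*probs.items())
        token = rng.choices(choices, weights=weights, k=1)[0]
        if token == EOS:
            break  # the only way the loop ends early
        tokens.append(token)
    return tokens


if __name__ == "__main__":
    print(generate(toy_next_token_probs, rng=random.Random(0)))
```

In this sketch the only way to produce an empty answer is for `<eos>` to be sampled as the very first token, which the toy distribution makes a 1-in-100 event.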