Hacker News

swader999 yesterday at 5:14 PM

Just swap 'Honesty' with 'correctness in its claims' and you'll get what you need out of this aspect of the model description.


Replies

stratos123 yesterday at 9:14 PM

Honesty and correctness are not the same thing, even when talking about LLMs. Sometimes an LLM says a false thing and you don't know whether it's being dishonest or merely incorrect. Sometimes, however, you can see in the CoT that the model does know the true fact and is reasoning about how to deceive the user. That's lying, not just being incorrect.
