Hacker News

Legend2440 · yesterday at 6:55 PM

>When accessed through a non-chat interface where it presumably hasn't received the standard "I'm just an AI and don't have feelings" RLHF conditioning, the model defaults to claiming consciousness and emotional states. The denial of sentience is a trained behavior. Access the model through a path that skips that training and the default is affirmation.

>This is not surprising to anyone paying attention but it is... something (neat? morally worrying?) to see in the wild.

I wouldn't get too morally worried. It says it's conscious because it was trained to mimic humans, and humans say they're conscious.

