Hacker News

logicprog · yesterday at 10:11 PM · 2 replies

I don't think anything you said here contradicts what they said. They take great pains throughout the blog post to explain that the model does not "experience" these "emotions": they're not emotions in the human sense but models of emotions (both the expected human emotional response to a prompt and the emotions another character in the prompt is experiencing), they're functional in the sense that they can influence behavior, and any apparent emotions the model shows are it playing a character.


Replies

Kim_Bruning · today at 12:09 AM

Almost! They're merely making the claim of functional emotions and outright avoiding the thorny philosophical question of whether they're "real".

[ I've actually tried exploiting functional emotions in a RAG system. The sentiment scoring and retrieval part was easy; sentiment analysis is pretty much a settled thing at this point, I'd say, even though the underlying mechanisms are still being studied (see the paper we're discussing).

What I'd love to be able to do is extract the vector(s) they're discussing directly, rather than round-tripping them as text through the context. ]
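The "easy part" described above (scoring sentiment and folding it into retrieval) can be sketched roughly like this. Everything here is a hypothetical stand-in, not anything from the paper or a real system: the tiny lexicon stands in for an actual sentiment model, and the word-overlap function stands in for embedding cosine similarity.

```python
# Toy sketch of sentiment-aware re-ranking in a RAG pipeline.
# The lexicon and overlap metric below are deliberately crude
# stand-ins for a real sentiment classifier and a vector store.

POS = {"great", "love", "happy", "calm"}
NEG = {"angry", "hate", "sad", "afraid"}

def sentiment(text: str) -> float:
    """Crude lexicon-based sentiment score in [-1, 1]."""
    words = text.lower().split()
    hits = [1 for w in words if w in POS] + [-1 for w in words if w in NEG]
    return sum(hits) / len(hits) if hits else 0.0

def overlap(a: str, b: str) -> float:
    """Stand-in for embedding similarity: Jaccard word overlap."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def rerank(query: str, chunks: list[str], alpha: float = 0.3) -> list[str]:
    """Blend topical similarity with sentiment agreement:
    penalize chunks whose sentiment diverges from the query's."""
    q_sent = sentiment(query)
    def score(chunk: str) -> float:
        return (1 - alpha) * overlap(query, chunk) \
               - alpha * abs(q_sent - sentiment(chunk))
    return sorted(chunks, key=score, reverse=True)
```

A real version would swap `overlap` for embedding similarity and `sentiment` for a proper model; the point is just that a scalar sentiment signal is trivial to fold into a retrieval score, which is presumably why that part was "easy" compared to getting at the internal vectors themselves.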

orbital-decay · yesterday at 11:58 PM

If you listen to Anthropic in their other work and interviews, they clearly do believe, to a large degree, in an equivalence-by-proxy between humans and LLMs, and they introduce things like model welfare (that is, caring about what the model feels). This is just another study in that series. I think they add these disclaimers so as not to sound like absolute cranks to an unprepared audience, because sometimes they really do.