Hacker News

gAI today at 3:33 PM

You're essentially summoning a character to role-play with. Just like with esoteric evocation, it's very easy to summon the wrong aspect of the spirit. Anthropic has a lot to say about this:

https://www.anthropic.com/research/persona-selection-model

https://www.anthropic.com/research/assistant-axis

https://www.anthropic.com/research/persona-vectors


Replies

hammock today at 3:40 PM

Unfortunately (after reading your links), all of the control surfaces for mitigating spirit summoning seem to be in model training, creation, and tuning, not something you can change meaningfully through prompting.

Perhaps the LLM itself, rather than the role model you created in one particular chat conversation or another, is better understood to be the “spirit.”

As a non-coder who only chats with pre-existing LLMs and doesn’t train or tune them, I feel mostly powerless.

jerf today at 5:02 PM

I am polite when using AI, not because I mistake it for a human, but because I'm deliberately keeping it in the "professional colleague" persona. Tell it to push back, and then thank it when it finds an error in your work. I may put a small self-deprecating joke in from time to time. It keeps the "mood" correct.

Another way you can think of it is that when you're talking to an AI, you're not talking to a human, you're talking to a distillation of humanity, as a whole, in a box. You want to be selective about which portion of humanity you lead to be dominant in a given conversation. There's a lot in there. There's a lot of conversations where someone makes a good critical point and a flamewar is the response. A lot of conversations where things get hostile. I'm sure the subsequent RLHF helps with that, but it doesn't hurt anything to try to help it along.

I see people post their screenshots of an AI pushing back and asking the user to do it or some other AI to do it, and while I'm as amused as the next person, I wonder what is in their context window when that happens.

rdevilla today at 4:38 PM

Spot on.