tarsinge • yesterday at 12:53 PM • 1 reply

To me this was already quite intuitive. We are not really managing a psychological state: at its core, an LLM tries to make the concatenation of your input and its generated output as similar as possible to what it was trained on. I think it’s quite rare for an LLM’s training set to contain examples of well-thought-out, professional solutions produced in a hackish, urgent context.

Replies

astrange • yesterday at 9:27 PM

No, that's how base model pretraining works. Claude's behavior is more based on its constitution and RLVR feedback, because that's the most recent thing that happened to it.
