Hacker News

ericbarrett · today at 12:46 AM (1 reply)

> GPT-4 now considers self modifying AI code to be extremely dangerous and doesn't like talking about it. Claude's safety filters began shutting down similar conversations a few months ago, suggesting the user switch to a dumber model.

I speculate that this has more to do with recent high-profile cases of self-harm linked to "AI psychosis" than with any AGI-adjacent danger. I've read a few of the chat transcripts made public in related lawsuits, and there is a recurring theme of recursive or self-modifying "enlightenment" role-played by the LLM. Discouraging exploration of these themes would be a logical change for the vendors to make.


Replies

andai · today at 4:59 AM

Heh, well, associating a potentially internet-ending line of research with mental illness does qualify as a societal prophylactic.