I don't buy that ChatGPT is actually doing these users any harm.
I think OpenAI is doing the best they reasonably can with a very difficult class of users, whose problems are neither their fault nor within their power to fix.
OpenAI has 900 million weekly active users, so the article's 1.2-3 million flagged users work out to roughly 0.1-0.3% of them. That's actually well below population-level rates for the same symptoms; in the US, suicidal ideation alone affects a larger share of people.
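A quick back-of-the-envelope in Python, taking the article's 1.2-3 million figure and the 900 million number above at face value (neither independently verified):

    # rough ratio of flagged users to weekly active users (illustrative only)
    flagged_low, flagged_high = 1.2e6, 3.0e6   # article's weekly range
    weekly_active = 900e6                      # figure claimed above
    print(f"{flagged_low / weekly_active:.2%}")    # ~0.13%
    print(f"{flagged_high / weekly_active:.2%}")   # ~0.33%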
The bad cases make headlines. But I think it's quite possible that AI is helping a lot of people in distress. Many people are uncomfortable opening up to humans, or have no one to talk to, or can't afford to fork over whatever-hourly-rate a therapist takes.
I really enjoyed Dr. K's videos on AI psychosis, namely:
https://www.youtube.com/watch?v=MW6FMgOzklw
https://www.youtube.com/watch?v=BzsLbHoNXTs
I would suggest that people run their ideas through other humans at least as much as through AI, to stay grounded. I think there is a risk even if you're using AI in a strictly professional capacity (to help you with your job).
I sympathize with the piece; evaluating how LLMs interact with mentally vulnerable users is something I've been actively working on: https://vigil-eval.com/
The biggest observation so far is that the latest models are night and day compared with LLMs from even six months ago (from OpenAI and Anthropic; Google is still very poor!)
I don't think that governments or civil society at large have found a good balance on mental health. Expecting profit-oriented companies to do as well or better is strange.
Don't get me wrong, mental health is important and should be considered and improved. But companies won't do it just for the sake of it.
I find it somewhat telling that most (not all) of this thread doesn't even attempt to answer the questions posed by the OP, but flatly denies that the problem of psychological harm exists at all.
I feel this is an example of the two larger narratives about AI that currently seem to be forming:
For one side, AI is basically every harmful technology ever invented rolled into one: it's harmful to the environment (via waste of energy and resources), harmful to the information space (by polluting everything with slop and devaluing human expression), harmful to society (by encouraging ever more shoddy and unreliable products, by taking away jobs, by replacing human-to-human interaction, and by normalizing a mode of development where not even the developers understand what is going on), and harmful to whoever uses it personally (by causing ever-growing dependence on AI, whether merely in skills or even emotionally and psychologically, up to the point of AI psychosis and preferring AI agents to other humans).
For the other side, AI is the future, the next industrial revolution, the thing that you have to adapt or will be left behind, possibly even the next stage of evolution.
Right now, I feel each side is digging in and trying ever harder to ignore the other.
(The AI labs acknowledge "AI risks" in theory, but, as the article pointed out, the risks they perceive and ostensibly work against are so abstract and so removed from the everyday use of AI that they do more to make the AI proponents' point.)
I fear the end result of this growing tension is a Molotov cocktail in Sam Altman's home.
I'd really like to know more about what the tech community at large is trying to do about this rift.
The "route to a human" part is the bigger gap. Which human? OpenAI isn't licensed as a healthcare provider in any jurisdiction. A real intervention apparatus for 1-3M weekly flagged users is not feasible. I don't think the labs have refused to build it. I think nobody knows what it should look like, and "labs measure what they're pressured to measure" papers over that.
Gemini told me just this morning that there are three pillars of cognitive decline related to AI use:

- Reduced ability to exert cognitive effort, resulting from habitual offloading of tasks.
- Diminished meta-cognitive self-trust, due to constantly seeking external validation from AI.
- Decline in memory encoding, because less mental effort is spent processing information.

In all seriousness, however, I think some of the interesting things to observe in this area are the reaction against the word 'safety' as a whole and its replacement with 'security'. Safety seems to have its roots in work like Ralph Nader's on automobiles, while security is something that can be manufactured and sold. In this sense I wonder how the discourse of 'personal AI safety' fits into past discussions of offloading the risks of corporate choices onto individuals.

But in the case of LLMs, it really is the case that what makes them useful is what makes them dangerous. And ultimately, because of the high dimensionality of the language space they encode, it seems impossible to build any technical barrier that completely cuts off access to the parts of that space that encode, for example, encouraging someone to kill themselves. Things can be done, and are done, in fine-tuning, pre- and post-filtering, etc., to reduce how readily a system shares this kind of output with a user, but all that can ever do is reduce it. Then the question is whose responsibility it is to make sure that these things are done well.
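To make that last point concrete, here's a minimal post-filtering sketch (purely illustrative; the function name and the keyword scorer are mine, not any lab's actual pipeline): a classifier scores a drafted reply, and replies above a threshold are withheld. Anything the classifier mis-scores still gets through, which is why filtering can only reduce, never eliminate, harmful output.

    # Hypothetical post-filter: gate a drafted reply on a risk score.
    # It blocks harmful output only as well as the classifier scores it.
    from typing import Callable

    def post_filter(draft_reply: str,
                    risk_score: Callable[[str], float],
                    threshold: float = 0.5) -> str:
        score = risk_score(draft_reply)  # e.g. a fine-tuned classifier, 0..1
        if score >= threshold:
            return "I can't continue with that. If you're in crisis, please contact a local hotline."
        return draft_reply

    # demo with a trivial keyword scorer standing in for a real classifier
    demo_score = lambda t: 1.0 if "hurt yourself" in t.lower() else 0.0
    print(post_filter("You should hurt yourself.", demo_score))          # withheld
    print(post_filter("Here's a recipe for banana bread.", demo_score))  # passes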
"AI safety", as defined here, has most of the problem that "fact checking" for social media had. Many of the same problems the "woke" concern about "microagressions" had. Most of the techniques used in advertising. Much of what passes for political discourse today has the same problems. It's somewhat convincing bullshit.
Should AIs be held to a higher standard than X/Twitter? Than Reddit? Than Fox News? What censorship is appropriate? And, yes, alignment is censorship.
Then there's the big problem of chatbots telling you what you seem to want to hear. This is an old problem. "Happy Talk", from "South Pacific", is the entertainment version. "Wartime", by Paul Fussell, is the serious version.
As the article points out, a small percentage of the population is very vulnerable to certain types of misinformation. It may be the same fraction of the population that's vulnerable to cults. But maybe not. Cults have a group self-reinforcing mechanism and an agenda. Chatbots have neither. Worth studying.
The point here is that restrictions on chatbots strong enough to protect the vulnerable would close off most political and social discourse.
>Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human?
there aren't enough humans.
“AI safety” as it’s understood today is an entire faith-based belief system, incubated in a cult-like community with a high propensity for drug abuse and mental illness, over more than a decade.
The reason that real-world harms caused by AI can’t get a hearing in what is now the mainstream AI safety community is that these harms were never part of the core tenets of the cult.
Best of luck to anyone working on reality-based AI harm reduction, you have many hard battles in front of you.
The ‘tobacco warning label’ approach sounds good but I’m not sure if it stopped that many people from smoking or was just a means for corporations to limit their liability. Corporate culture being what it is, having warnings like the following pop up every time a client opens an LLM app would not be that popular in the C-suite. Possible examples:
AI MENTAL SAFETY WARNING:
> This chatbot can sound caring, certain, and personal, but it is not a human and cannot protect your mental health. It may reinforce false beliefs, emotional dependence, suicidal thinking, manic plans, paranoia, or poor decisions. Do not use it as your therapist, sole confidant, crisis counselor, doctor, lawyer, or source of reality-testing.
AI TECHNICAL SAFETY WARNING:
> This AI may generate plausible but destructive technical instructions. Incorrect commands can erase data, expose secrets, compromise security, damage systems, or brick hardware. Never run commands you do not understand. Always verify AI-generated code, scripts, and shell commands before execution.
Now, if I’m running my own open-source model on my own hardware, I can’t really blame the model if I myself make bad decisions based on its advice - that’s like growing your own tobacco from seed in your garden, drying and curing it, then complaining about the health effects after you smoke it. If I give it agentic capabilities on my LAN without understanding the risks, same old story - with great power comes great responsibility.
> Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human? This is one of many questions I can’t find concrete answers for.
I don't know if there are studies or concrete data either way, but it seems at least plausible that continuing the conversation could be more effective (read: saves more lives) than stopping it.
the big labs could crank up their (brand) safety dials to the point where their chatbots give GOODY-2 responses to everything beyond PG13, and guess what? there are a hundred other services available, built upon Chinese models 5-10% behind Western SOTA.
it is no longer 2023. let go of whatever delusions you might hold about un-opening this Pandora's box.
If you are using LLMs for emotional support or social interactions, you’ve got personal problems and that isn’t on the LLM provider to babysit. Same with people who unironically pay for OnlyFans or whatever.
I don't even work in tech and I detest the Facebook/Zuckerbergs of the world, but it's obnoxious and trite seeing tech companies get scapegoated for what are ultimately social and societal problems, not tech problems.
As a solution, it'd probably make sense to start with how disconnected most modern families are in terms of support and accountability.
From ChatGPT to Instagram, tech companies follow the contours of how society already operates.
Autodiff is preventing any meaningful discussion about safety; systems trained with autodiff cannot be made safe.
"There is no independent audit, no time series, no disclosed methodology, so we have no idea whether the real figure is higher, whether it is growing, or how it compares across the other frontier models, none of which publish equivalent data."
Tip for writers: aggressively filter the "no X, no Y, no Z" pattern out of your writing. Whether or not you used AI to help you write, it's such a red flag now that you should be actively avoiding it in anything you publish.
> Every week, somewhere between 1.2 and 3 million ChatGPT users, roughly the population of a small country, show signals of psychosis, mania, suicidal planning, or unhealthy emotional dependence on the model.
> Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human?
Well, obviously "routing to a human" is not feasible at that scale. And cold-exiting the conversation is probably worse for the user than answering carefully.