
rdw • today at 1:56 AM

I spoke with an Anthropic employee, and came to understand that their definition of safety is more like "making AI be a tool that humans can use without hurting themselves or others more than they can already do". It's literally about how AI makes it easier for people to construct bombs, poisons, manipulation, and exploits. That's consistent with their caution about releasing Mythos to unvetted actors. So it's not about superintelligence killing humanity, at least as far as this employee conveyed to me.

This means their strategy is more like:

1. If someone builds a market-leading unsafe strong AI, it may be misused in a damaging way by a large number of humans, undermining society and creating a catastrophic upheaval.

2. However, if the leading AI maker also works to make it safe against misuse, then as long as they stay in the lead and keep it safe, the ability of human bad actors to misuse the AI is limited. Given enough time, society will adapt to pretty much anything, so eventually there's no longer an arms race to stay ahead.

I don't really know whether I agree with their concerns, but I do think that (my understanding of) their principles are reasonable and self-consistent, and that they adhere to them in all their public and private actions.
