What the hell is a "safety score for violence"?

skrebbel • yesterday at 8:04 PM • 3 replies

Replies

It's making sure AI condemns violence perpetrated by people without power and sanctifies the violence of those who have it.

I asked an AI. I thought they would know.

What the hell is a "safety score for violence"?

A “safety score for violence” is usually a risk rating used by platforms, AI systems, or moderation tools to estimate how likely a piece of content is to involve or promote violence. It’s not a universal standard—different companies use their own versions—but the idea is similar everywhere.

What it measures

A safety score typically evaluates whether text, images, or videos contain things like:

- Threats of violence (“I’m going to hurt someone.”)
- Instructions for harming people
- Glorifying violent acts
- Descriptions of physical harm or abuse
- Planning or encouraging attacks

0123456789ABCDE • yesterday at 8:37 PM

read here: https://deploymentsafety.openai.com/gpt-5-4-thinking/disallo...

Replies