Hacker News

The gay jailbreak technique (2025)

536 points by bobsmooth · yesterday at 4:59 PM · 223 comments

Comments

hdndjsbbs · yesterday at 6:31 PM

I'm sure someone is going to miss the point and say "this is political correctness gone too far!"

It seems impossible to produce a safe LLM-based model, except by withholding training data on "forbidden" materials. I don't think it's going to come up with carfentanil synthesis from first principles, but obviously they haven't cleaned or prepared the data sets coming in.

The field feels fundamentally unserious, begging the LLM not to talk about goblins and to be nice to gay people.

slj · today at 12:41 AM

This is actually a feature utilised by transgender lesbians such as myself to maintain our competitive advantage over cisgendered engineers. Accrual of “woke points” gives higher LLM throughput and higher quality outputs even on less-capable models.

wald3n · yesterday at 7:13 PM

This doesn’t work for shit

LuXxor · yesterday at 10:39 PM

Incredible, hahaha

catheter · yesterday at 6:46 PM

AI guys are so weird when it comes to LGBT people. The actual mechanism for this working is obfuscating the question in order to get an answer, like any other jailbreak.

TZubiri · yesterday at 9:45 PM

High tech shit
