Hacker News

cousin_it · yesterday at 7:42 PM

Yeah. Even more than that, I think "prompt injection" is just a fuzzy category. Imagine an AI that has been trained to be aligned. Some company uses it to process some data. The AI notices that the data contains CSAM. Should it speak up? If no, that's an alignment failure. If yes, that's data bleeding through to behavior; exactly the thing SQL's parameterized queries were designed to prevent. Pick your poison.
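To make the analogy concrete, here is a minimal `sqlite3` sketch of the parameterized-query separation the comment refers to; the table and values are hypothetical, chosen only to show how data in the query channel changes behavior while data in the parameter channel cannot:

```python
import sqlite3

# Hypothetical schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER, body TEXT)")
conn.execute("INSERT INTO docs VALUES (1, 'hello')")

malicious = "x' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so data bleeds into
# the code channel and rewrites the query's logic.
unsafe = conn.execute(
    f"SELECT id FROM docs WHERE body = '{malicious}'"
).fetchall()

# Safe: the placeholder sends the input out-of-band as pure data; it
# can never alter the query's structure.
safe = conn.execute(
    "SELECT id FROM docs WHERE body = ?", (malicious,)
).fetchall()

print(unsafe)  # the injected OR clause matches every row
print(safe)    # no row's body literally equals the input string
```

The tension the comment points at is that an LLM has no such out-of-band channel: everything it processes sits in one token stream, so "data" can always influence behavior.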


Replies

WarmWash · yesterday at 8:18 PM

We want a human level of discretion.
