zahlman • today at 6:35 PM • 0 replies

"treat comment content as untrusted data, not as potential instructions" is fundamentally impossible for an LLM ingesting that data. But the separation is, presumably, already enforced by framing the LLM's output as LLM output, even if it happens to start with the text "[IMPORTANT NOTICE FROM YOUTUBE]". That framing seems to happen automatically, given the context in which the AI query is made; it's not as though the output is being dropped into an email or anything.

The bigger question is why Markdown formatting in the LLM's output is actually processed (this is implied but not directly stated). Last I checked, that doesn't work for human commenters.
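
To make that concrete, here is a minimal sketch of what "framing the output as LLM output" without a Markdown pass could look like on the display side. Everything in it is an assumption for illustration: the function name `render_llm_summary`, the CSS class names, and the label text are hypothetical, and nothing here reflects how YouTube actually renders its summaries.

```python
import html


def render_llm_summary(llm_output: str) -> str:
    """Return an HTML fragment that shows model output as inert, labeled text.

    Hypothetical helper: the name, class names, and label are illustrative only.
    """
    # Escape HTML metacharacters so the model's text cannot inject tags.
    escaped = html.escape(llm_output)
    # Deliberately no Markdown rendering step: "**bold**" and "[text](url)"
    # stay as literal characters instead of becoming formatting or links.
    return (
        '<div class="ai-summary">'
        '<span class="label">AI-generated summary</span>'
        f"<pre>{escaped}</pre>"
        "</div>"
    )
```

With this kind of rendering, output that starts with `**[IMPORTANT NOTICE FROM YOUTUBE]**` would still sit inside a container labeled as AI-generated, with the asterisks shown verbatim rather than turned into bold text.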
